Abstract

In this paper we study a random graph with $N$ nodes, where node $j$ has degree $D_j$ and
$\{D_j\}_{j=1}^N$ are i.i.d. with $P(D_j \le x) = F(x)$. We assume that $1 - F(x) \le c x^{-\tau+1}$ for some $\tau > 3$
and some constant $c > 0$. This graph model is a variant of the so-called configuration model,
and includes heavy tail degrees with finite variance.

The minimal number of edges between two arbitrary connected nodes, also known as the
graph distance or the hopcount, is investigated when $N \to \infty$. We prove that the graph distance
grows like $\log_\nu N$, when the base of the logarithm equals $\nu = E[D_j(D_j - 1)]/E[D_j] > 1$. This
confirms the heuristic argument of Newman, Strogatz and Watts [35]. In addition, the random
fluctuations around this asymptotic mean $\log_\nu N$ are characterized and shown to be uniformly
bounded. In particular, we show convergence in distribution of the centered graph distance along
exponentially growing subsequences.

1 Introduction

The study of complex networks plays an increasingly important role in science. Examples of such
networks are electrical power grids and telephony networks, social relations, the World-Wide Web
and Internet, co-authorship and citation networks of scientists, etc. The structure of these networks
affects their performance. For instance, the topology of social networks affects the spread of
information and disease (see e.g., [SZl). The rapid evolution in, and the success of, the Internet have
incited fundamental research on the topology of networks.

Different scientific disciplines report their own viewpoints and new insights in the broad area
of networking. In computer science and electrical engineering, massive Internet measurements have
led to fundamental questions in the modelling and characterization of the Internet topology [22, 38].
These modelling questions drive the understanding of the Internet's complex behavior and make it
possible to plan and to control end-to-end communication. The pioneering work of Strogatz and Watts (see
e.g. [37, 41] and the references therein) has triggered an immense number of research papers in
the field of theoretical physics. Strogatz and Watts proposed 'small world networks' and illustrated
how such small worlds can arise due to underlying mechanisms in different practical networks such
as social networks, growing structures in nature, the Web, etc.

Albert and Barabási in [2] showed that preferential attachment of nodes gives rise to a class
of graphs often called 'scale free networks'. See also |1J |H] and the references therein. Scale free
networks seem to explain the structure of the World-Wide Web, the autonomous domain structure
of Internet, citation graphs and many other complex networks (see e.g., jlJISSl). The essence of scale
free networks is that the nodal degree is a power law, or, alternatively, heavy-tailed, meaning that
the number of nodes with degree equal to $k$ is proportional to $k^{-\tau}$ for some power exponent $\tau > 1$.
On the World-Wide Web, it has indeed been shown that there are power law degree sequences, both
for the in- and out-degrees (see [16, 29]). The work of Albert and Barabási has inspired substantial
work on scale-free graphs and can be seen as a way to understand the emergence of power law degree
sequences. In the model by Albert and Barabási f3^, this power exponent is restricted to $\tau = 3$
[14], but in refinements of the model, different values of $\tau$ can be obtained. See, e.g., pi IIUI IT^ l3Uj
and the references therein. We will comment on the relations between our work and preferential
attachment models in Section 1.4 below. For an overview of the extensive field of random graphs,
we refer to the books of Bollobás [SI and Janson et al. [28].

The current paper presents a rigorous mathematical derivation for the random fluctuations of
the graph distance between two arbitrary nodes in a graph with finite variance degrees. These finite
variance degrees include power laws with power exponent $\tau > 3$. We consider the configuration
model with power law degree sequences, a variation on a model originally proposed by Newman,
Strogatz and Watts [35], prove their conjecture and proceed beyond their results by combining
coupling theory, branching processes and shortest path graphs.



1.1 Model definition 

Fix an integer $N$. Consider an i.i.d. sequence $D_1, D_2, \ldots, D_N$. We will construct an undirected
graph with $N$ nodes where node $j$ has degree $D_j$. We will assume that $L_N = \sum_{j=1}^N D_j$ is even. If
$L_N$ is odd, then we add a stub to the $N$th node, so that $D_N$ is increased by 1. This single stub will
make hardly any difference in what follows, and we will ignore this effect. We will later specify the
distribution of $D_1$.

To construct the graph, we have $N$ separate nodes and incident to node $j$, we have $D_j$ stubs.
All stubs need to be connected to build the graph. The stubs are numbered in a given order from
1 to $L_N$. We start by connecting at random the first stub with one of the $L_N - 1$ remaining stubs.
Once paired, two stubs form a single edge of the graph. Hence, a stub can be seen as the left or the
right half of an edge. We continue the procedure of randomly choosing and pairing the stubs until
all stubs are connected. Unfortunately, nodes having self-loops may occur. However, self-loops are
scarce when $N \to \infty$.
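
The stub-pairing construction can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `pair_stubs` and `count_self_loops` are hypothetical, and instead of pairing the stubs one at a time it shuffles all stubs and reads them off in consecutive pairs, which we assume here as an equivalent way of producing a uniformly random perfect matching of the stubs.

```python
import random
from collections import Counter


def pair_stubs(degrees, rng=None):
    """Pair stubs uniformly at random and return the list of edges (u, v).

    degrees[j] is the number of stubs of node j. If the total number of
    stubs is odd, one stub is added to the last node, as in the text.
    """
    rng = rng or random.Random()
    degrees = list(degrees)
    if sum(degrees) % 2 == 1:
        degrees[-1] += 1
    # One list entry per stub, labelled by the node it is attached to.
    stubs = [node for node, d in enumerate(degrees) for _ in range(d)]
    # A uniform shuffle read off in consecutive pairs is a uniformly
    # random perfect matching of the stubs.
    rng.shuffle(stubs)
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]


def count_self_loops(edges):
    """Number of edges whose two stubs belong to the same node."""
    return sum(1 for u, v in edges if u == v)
```

For example, `pair_stubs([3, 2, 2, 1, 1])` first raises the last degree to 2 (the total 9 is odd) and then returns 5 edges; self-loops and multiple edges are possible, in line with the remark above.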

We now specify the degree distribution we will investigate in this paper. The probability mass
function and the distribution function of the nodal degree $D$ are denoted by

$$P(D = j) = f_j, \quad j = 0, 1, 2, \ldots, \qquad \text{and} \qquad F(x) = \sum_{j=0}^{\lfloor x \rfloor} f_j, \tag{1.1}$$

where $\lfloor x \rfloor$ is the largest integer smaller than or equal to $x$. Our main assumption is that for some
$\tau > 3$ and some positive constant $c$,

$$1 - F(x) \le c x^{-\tau+1} \qquad (x > 0). \tag{1.2}$$

This condition implies that the second moment of $D$ is finite. The often used condition that
$1 - F(x) = x^{-\gamma+1} L(x)$, $\gamma > 3$, with $L$ a slowly varying function is covered by (1.2), because by
Potter's Theorem [23, Lemma 2, p. 277], any slowly varying function $L(x)$ can be bounded above
and below by an arbitrarily small power of $x$, so that (1.2) holds for any $\tau < \gamma$.

The above model is closely related to the so-called configuration model, in which the degrees of
the nodes are often assumed to be fixed (rather than i.i.d.). See [33, Section 4.2.1] and the references
therein. We will review some results proved for the configuration model in Section 1.4 below.



1.2 Main results 

We denote

$$\mu = E[D_1], \qquad \nu = \frac{E[D_1(D_1 - 1)]}{E[D_1]}, \tag{1.3}$$

and we define the distance or hopcount $H_N$ between the nodes 1 and 2 as the minimum number of
edges that form a path from 1 to 2 where, by convention, the distance equals $\infty$ if nodes 1 and 2
are not connected. Since the nodes are exchangeable, the distance between two randomly chosen
nodes is equal in distribution to $H_N$. Our main result is the following theorem:

Theorem 1.1 (Limit law for the typical nodal distance) Assume that $\tau > 3$ in (1.2) and
that $\nu > 1$. For $k \ge 1$, let $a_k = \lfloor \log_\nu k \rfloor - \log_\nu k \in (-1, 0]$. There exist random variables $(R_a)_{a \in (-1,0]}$
such that as $N \to \infty$,

$$P(H_N - \lfloor \log_\nu N \rfloor = k \mid H_N < \infty) = P(R_{a_N} = k) + o(1), \qquad k \in \mathbb{Z}. \tag{1.4}$$

In words, Theorem 1.1 states that for $\tau > 3$, the graph distance $H_N$ between two randomly
chosen connected nodes grows like $\log_\nu N$, where $N$ is the size of the graph, and that the
fluctuations around this mean remain uniformly bounded in $N$. Theorem 1.1 proves the conjecture
in Newman, Strogatz and Watts [35, Section II.F, (54)], where a heuristic is given that the number
of edges between arbitrary nodes grows like $\log_\nu N$. In addition, Theorem 1.1 improves upon that
conjecture by specifying the fluctuations around the value $\log_\nu N$.

We will identify the laws of $(R_a)_{a \in (-1,0]}$ in Theorem 1.4 below. Before doing so, we state two
consequences of the above theorem:

Corollary 1.2 (Convergence in distribution along subsequences) Fix an integer $N_1$. Under
the assumptions in Theorem 1.1, and conditionally on $H_N < \infty$, along the subsequence
$N_k = \lfloor N_1 \nu^{k-1} \rfloor$, the sequence of random variables $H_{N_k} - \lfloor \log_\nu N_k \rfloor$ converges in distribution to $R_{a_{N_1}}$ as
$k \to \infty$.

Simulations illustrating the convergence in Corollary 1.2 are discussed in Section 1.5.

Corollary 1.3 (Concentration of the hopcount) Under the assumptions in Theorem 1.1,

(i) with probability $1 - o(1)$ and conditionally on $H_N < \infty$, the random variable $H_N$ is in between
$(1 \pm \varepsilon) \log_\nu N$ for any $\varepsilon > 0$;

(ii) conditionally on $H_N < \infty$, the random variables $H_N - \log_\nu N$ form a tight sequence, i.e.,

$$\lim_{K \to \infty} \limsup_{N \to \infty} P(|H_N - \log_\nu N| \le K \mid H_N < \infty) = 1. \tag{1.5}$$

We need a limit result from branching process theory before we can identify the limiting random
variables $(R_a)_{a \in (-1,0]}$. In Section 2 below, we introduce a delayed branching process $\{\mathcal{Z}_k\}$, where
in the first generation, the offspring distribution is chosen according to (1.1) and in the second and
further generations, the offspring is chosen in accordance to $g$ given by

$$g_j = \frac{(j+1) f_{j+1}}{\mu}, \qquad j = 0, 1, \ldots. \tag{1.6}$$

The process $\{\mathcal{Z}_k / (\mu \nu^{k-1})\}$ is a martingale with uniformly bounded expectation and consequently
converges almost surely to a limit:

$$\lim_{k \to \infty} \frac{\mathcal{Z}_k}{\mu \nu^{k-1}} = W \quad \text{a.s.} \tag{1.7}$$

In the theorem below we need two independent copies $W^{(1)}$ and $W^{(2)}$ of $W$.
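
As a small numerical illustration of (1.3) and (1.6), the following Python sketch computes $\mu$, $\nu$ and the size-biased offspring law $g$ from a finitely supported mass function $f$. The helper name `size_biased` and the restriction to finite support are assumptions made only for this example.

```python
def size_biased(f):
    """Return (mu, nu, g) for a pmf given as f[j] = P(D = j), j = 0..len(f)-1.

    g[j] = (j + 1) * f[j + 1] / mu is the offspring law of (1.6);
    nu is its mean, which equals E[D(D - 1)] / E[D] as in (1.3).
    """
    mu = sum(j * fj for j, fj in enumerate(f))
    g = [(j + 1) * f[j + 1] / mu for j in range(len(f) - 1)]
    nu = sum(j * gj for j, gj in enumerate(g))
    return mu, nu, g
```

For the 3-regular graph, `size_biased([0, 0, 0, 1])` gives $\mu = 3$, $\nu = 2$ and $g$ concentrated on 2. For degrees uniform on $\{1, 2, 3\}$ it gives $\mu = 2$ and $\nu = 4/3$.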

Theorem 1.4 (The limit laws) Under the assumptions in Theorem 1.1, and for $a \in (-1, 0]$,

$$P(R_a > k) = E\left[\exp\{-\kappa \nu^{a+k} W^{(1)} W^{(2)}\} \mid W^{(1)} W^{(2)} > 0\right], \tag{1.8}$$

where $W^{(1)}$ and $W^{(2)}$ are independent limit copies of $W$ in (1.7) and where $\kappa = \mu(\nu - 1)^{-1}$.
We will also provide an error bound of the convergence stated in Theorem 1.1. Indeed, we show
that for any $\alpha > 0$, and for all $k \le \eta \log_\nu N$ for some $\eta > 0$ sufficiently small,

$$P(H_N > \lfloor \log_\nu N \rfloor + k) = E\left[\exp\{-\kappa \nu^{a_N + k} W^{(1)} W^{(2)}\}\right] + O((\log N)^{-\alpha}). \tag{1.9}$$

Unfortunately, due to the conditioning in Theorem 1.1, it is hard to obtain an explicit error bound
in (1.4).

The law of $R_a$ is involved, and can in most cases not be computed exactly. The reason for this is
the fact that the random variables $W$ that appear in its statement are hard to compute explicitly.
For example, for the power-law degree graph with $\tau > 3$, we do not know what the law of $W$ is. See
also Section 2. There are two examples where the law of $W$ is known. The first is when all degrees
in the graph are equal to some $r > 2$, and we obtain the $r$-regular graph (see also [15], where the
diameter of this graph is studied). In this case, we have that $\mu = r$, $\nu = r - 1$, and $W = 1$ a.s. In
particular, $P(H_N < \infty) = 1 + o(1)$. Therefore, we obtain that

$$P(R_a > k) = \exp\left\{-\frac{r}{r-2} (r-1)^{a+k}\right\}, \tag{1.10}$$

and $H_N$ is asymptotically equal to $\log_{r-1} N$. The second example is when the law $g$ is geometric,
in which case the branching process with offspring $g$ conditioned to be positive converges to an
exponential random variable with parameter 1. This example corresponds to

9j=p{l-py-\ so that fj = J-p{l-py-\ Vj>l, (1.11) 

and Cp is the normalizing constant. For p > ^, the law of W has the same law as the sum of Di 
copies of a random variable 3^, where 3^ = with probability and equal to an exponential 

random variable with parameter 1 with probability Even in this simple case, the computation 

of the exact law of $R_a$ is non-trivial. Although the laws of $R_a$ are hard to compute exactly, Theorems
1.1 and 1.4 make it possible to simulate the hopcount in random graphs of arbitrary size since the
law of $W$ is simple to approximate numerically, for example using Fast Fourier Transforms.
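
For the $r$-regular example, the limit law is fully explicit and easy to evaluate. The sketch below assumes the reading $P(R_a > k) = \exp\{-\frac{r}{r-2}(r-1)^{a+k}\}$ of (1.10), which is what (1.8) gives with $\mu = r$, $\nu = r - 1$ and $W = 1$; the function name `regular_survival` is hypothetical.

```python
import math


def regular_survival(r, a, k):
    """P(R_a > k) for the r-regular graph (r > 2), following (1.10)."""
    kappa = r / (r - 2)              # kappa = mu / (nu - 1) with mu = r, nu = r - 1
    return math.exp(-kappa * (r - 1) ** (a + k))


if __name__ == "__main__":
    # Survival function of R_a for the 3-regular graph and a = -0.5.
    for k in range(-4, 4):
        print(k, regular_survival(3, -0.5, k))
```

The values decrease doubly exponentially in $k$, which illustrates why the fluctuations of the hopcount around $\log_\nu N$ stay bounded.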

In ITT*, the expected value of the random variable $R_a$ is computed numerically, by comparing it
to $E[\log W \mid W > 0]$. One would expect that for some $\beta$ with $0 < \beta < \alpha$,

$$E[H_N \mid H_N < \infty] = \lfloor \log_\nu N \rfloor + E[R_{a_N}] + O((\log N)^{-\beta}). \tag{1.12}$$

If so, an accurate computation of $E[R_a]$ would yield the fine asymptotics of the expected hopcount,
and this would yield an extension of the conjectured results in [35, (54)]. Our methods stop short
of proving (1.12), and this remains an interesting question.

Our final result describes the size of the largest connected component and the maximal size of
all other connected components. In its statement, we write $G$ for the random graph with degree
distribution given by (1.1), and we write $q$ for the survival probability of the delayed branching
process $\{\mathcal{Z}_k\}$ described above. Thus, $1 - q$ is the extinction probability of the branching process.

Theorem 1.5 (The sizes of the connected components) With probability $1 - o(1)$, the largest
connected component in $G$ has $qN(1 + o(1))$ nodes, and there exists $\gamma < \infty$ such that all other
connected components have at most $\gamma \log N$ nodes.

1.3 Methodology and heuristics 

One can understand Theorems 1.1 and 1.4 intuitively as follows. Denote by $Z_k^{(1)}$, respectively, $Z_k^{(2)}$
the number of stubs of nodes at distance $k - 1$ from node 1, respectively, node 2 (see Section 3
for the precise definitions). Then for $N \to \infty$, the random processes $Z_1^{(i)}, Z_2^{(i)}, \ldots, Z_k^{(i)}$, which will
be called shortest path graphs (SPG's), behave as a delayed branching process as long as $Z_k^{(i)}$ is of
small order compared to $N$. Thus, the local neighborhood of the node $i$ is close in distribution to
a branching process.

We sample the stubs uniformly from all stubs and thus, for large $N$, we attach the stubs to the
SPG proportionally to $j f_j$. Moreover, when a new stub is attached to the SPG, the chosen stub
is used to attach the new node and forms an edge together with the present stub. Therefore, the
number of stubs of the freshly chosen node decreases by one and is equal to $j$ if the number of stubs
of the chosen node was originally equal to $j + 1$. This motivates (1.6).

The offspring of the node 1 is distributed as $D_1$, whereas the offspring distribution of $Z_2^{(i)}, Z_3^{(i)}, \ldots$
has (for $N \to \infty$) probability mass function (1.6). Consequently, as noted in [35, (51)], the mean
number of free stubs at distance $k$ is close to $\mu \nu^{k-1}$, where $\nu = \sum_{j=1}^{\infty} j g_j$ is defined in (1.3).
Moreover, a stub in $Z_k^{(1)}$ is attached with a positive probability to a stub in $Z_k^{(2)}$ whenever $Z_k^{(1)} Z_k^{(2)}$
is of order $L_N$. The total degree $L_N$ is proportional to $N$ by the law of large numbers, because
$\mu = E[D_1] < \infty$. Since both sets grow at the same rate, each has to be of order $\sqrt{N}$. Therefore,
$k$ is typically $\frac{1}{2} \log_\nu N$, and the typical distance between 1 and 2 is of order $2k = \log_\nu N$. This
can be made precise by coupling $Z_1^{(i)}, Z_2^{(i)}, \ldots$ to a branching process $\hat{Z}_1^{(i)}, \hat{Z}_2^{(i)}, \ldots$ having offspring
distribution $g^{(N)}$ given by

$$g_j^{(N)} = \sum_{l=1}^{N} I[D_l = j+1] \frac{D_l}{L_N} = \frac{j+1}{L_N} \sum_{l=1}^{N} I[D_l = j+1], \tag{1.13}$$

where $I[E]$ is the indicator of the event $E$. This coupling will be described in Section 3.1. In turn,
the branching process $\hat{Z}_1^{(i)}, \hat{Z}_2^{(i)}, \ldots$ will be coupled, in a conventional way, to a branching process
$\mathcal{Z}_1^{(i)}, \mathcal{Z}_2^{(i)}, \ldots$ with offspring distribution $\{g_j\}$ defined in (1.6). The limit results of Theorem 1.1 and
Theorem 1.4 depend on the martingale limit for super-critical branching processes with finite mean.

The proofs of Theorems 1.1 and 1.4 are based upon a comparison of the local neighborhoods of
nodes to branching processes. Such techniques are used extensively in random graph theory. An
early example is in [15], where the diameter of a random regular graph was investigated. See also
[HI Chapter 10], where comparisons to branching processes are used to describe the phase transition
and the birth of the giant component for the random graph $G(p, N)$.

The proof of Theorem 1.5 makes essential use of results by Molloy and Reed [31, 32] for the usual
configuration model. We will now describe their result. When the number of nodes with degree
$i$ in the graph of size $N$ equals $d_i(N)$ where $\lim_{N \to \infty} d_i(N)/N = Q(i)$, Molloy and Reed [31, 32]
identify the condition $\sum_{i=1}^{\infty} i(i-2) Q(i) > 0$ as the necessary and sufficient condition to ensure that
a 'giant component' proportional to the size of the graph exists. By rewriting the condition $\nu > 1$
in Theorem 1.1 as $E[D^2] - 2E[D] > 0$, we see that a similar condition as in the model of Molloy
and Reed is needed here. To prove Theorem 1.5 we need to check that the technical conditions
in [31, 32] are satisfied in our model. In fact, we need to alter the graph $G$ a little bit in order to
apply their results, since in [31] it is assumed that no nodes of degree larger than $N^{1/4 - \varepsilon}$ exist for
some $\varepsilon > 0$.

The novelty of our results is that we investigate typical distances in random graphs. In random
graph theory, it is more customary to investigate the diameter in the graph, and in fact, this would
also be an interesting problem. The research question investigated in this paper is inspired by
the Internet. In a seminal paper [22], Faloutsos et al. have shown that the degree distribution
of autonomous systems in Internet follows a power law with power exponent $\tau \approx 2.2$. Thus, the
power law random graph with this value of $\tau$ can possibly lead to a good Internet model on the
autonomous systems (AS) level (see [22, 38]). For the Internet on the more detailed router level,
extensive measurements exist for the hopcount, which is the number of routers traversed between
two typical routers, as well as for the AS-count, which is the number of autonomous systems
traversed between two typical routers. To validate the configuration model with i.i.d. degrees, we
intend to compare the distribution of the distance between pairs of nodes to these measurements
in Internet. For this, a good understanding of the typical distances between nodes in the degree
random graph is necessary, which formed the main motivation for our work. The hopcount in
Internet seems to be close to a Poisson random variable with a fairly large parameter. In turn, a
Poisson random variable with large parameter can be approximated by a normal random variable
with equal expectation and variance. See e.g. [34, 40] for data of the hopcount in Internet.

From a practical point of view, there are good reasons to study the typical distances in random 
graphs rather than the diameter. For one, typical distances are simpler to measure, and thus allow 
for a simpler validation of the model. Also, the diameter is a number, while the distribution of the 
typical distances contains substantially more information. Finally, the diameter is rather sensitive 
to small changes to a graph. For instance, when adding a string of a few nodes, one can dramatically 
alter the diameter, while the typical distances in the graph hardly change. Thus, typical distances 
in the graph are more robust to modelling discrepancies. 

1.4 Related work 

There is a wealth of related work which we will now summarize. The model investigated here was also
studied in [36], with $1 - F(x) = x^{-\tau+1} L(x)$, where $\tau \in (2, 3)$ and $L$ denotes a slowly varying function.
It was shown in [36] that the average distance is bounded from above by $2 \frac{\log\log N}{|\log(\tau - 2)|}(1 + o(1))$. We
plan to return to the question of average distances and connected component sizes when $\tau < 3$ in
three future publications [23 HSJ HB].

There is substantial work on random graphs that are, although different from ours, still similar
in spirit. In P, random graphs were considered with a degree sequence that is precisely equal to a
power law, meaning that the number of nodes with degree $k$ is precisely proportional to $k^{-\tau}$. Aiello
et al. ^ show that the largest connected component is of the order of the size of the graph when
$\tau < \tau_0 = 3.47875\ldots$, where $\tau_0$ is the solution of $\zeta(\tau - 2) - 2\zeta(\tau - 1) = 0$, and where $\zeta$ is the Riemann
Zeta function. When $\tau > \tau_0$, the largest connected component is of smaller order than the size of
the graph and more precise bounds are given for the largest connected component. When $\tau \in (1, 2)$,
the graph is with high probability connected. The proofs of these facts use couplings with branching
processes and strengthen previous results due to Molloy and Reed [31, 32] described above. For this
same model, Dorogovtsev et al. jSHl investigate the leading asymptotics and the fluctuations
around the mean of the distance between arbitrary nodes in the graph from a theoretical physics
point of view, using mainly generating functions.

A second related model can be found in [17] and [18], where edges between nodes $i$ and $j$ are
present with probability equal to $w_i w_j / \sum_l w_l$ for some 'expected degree vector' $w = (w_1, \ldots, w_N)$.

Chung and Lu [17] show that when $w_i$ is proportional to $i^{-1/(\tau-1)}$ the average distance between pairs
of nodes is $\log_\nu N (1 + o(1))$ when $\tau > 3$, and $2 \frac{\log\log N}{|\log(\tau - 2)|}(1 + o(1))$ when $\tau \in (2, 3)$. The difference
between this model and ours is that the nodes are not exchangeable in [17], but the observed
phenomena are similar. This result can be heuristically understood as follows. Firstly, the actual
degree vector in [17] should be close to the expected degree vector. Secondly, for the expected
degree vector, we can compute that the number of nodes for which the degree is at least
$k$ equals

$$|\{i : w_i \ge k\}| \propto |\{i : i^{-1/(\tau-1)} \ge k\}| \propto k^{-\tau+1}.$$

Thus, one expects that the number of nodes with degree at least $k$ decreases as $k^{-\tau+1}$, similarly
as in our model. In [18], Chung and Lu study the sizes of the connected components in the above
model. The advantage of this model is that the edges are independently present, which makes the
resulting graph closer to a traditional random graph.

All the models described above are static, i.e., the size of the graph is fixed, and we have not
modeled the growth of the graph. As described in the introduction, there is a large body of work
investigating dynamical models for complex networks, often in the context of the World-Wide Web.
In various forms, preferential attachment has been shown to lead to power law degree sequences.
Therefore, such models intend to explain the occurrence of power law degree sequences in random
graphs. See [21 il Cni Ell CHI El IHl EOl and the references therein. In the preferential
attachment model, nodes with a fixed degree $m$ are added sequentially. Their stubs are attached to
a receiving node with a probability proportional to the degree of the receiving node, thus favoring
nodes with large degrees. For this model, it is shown that the number of nodes with degree $k$ decays
proportionally to $k^{-3}$ [14], the diameter is of order $\log N / \log\log N$ when $m \ge 2$, and couplings to a
classical random graph $G(N, p)$ are given for an appropriately chosen $p$. See also ^21 for a
survey.

It can be expected that our model is a snapshot of the above models, i.e., a realization of the
graph growth processes at the time instant that the graph has a certain prescribed size. Thus,
rather than describing the growth of the model, we investigate the properties of the model at a
given time instant. This is suggested in [1, Section VII.D], and it would be very interesting indeed
to investigate this further mathematically, i.e., to investigate the relation between the configuration
and the preferential attachment models.

The reason why we study the random graphs at a given time instant is that we are interested in
the topology of the random graph. In [SH, and inspired by the observed power law degree sequence
in [22], the configuration model with i.i.d. degrees is proposed as a model for the AS-graph in
Internet, and it is argued on a qualitative basis that this simple model serves as a better model for
the Internet topology than currently used topology generators. Our results can be seen as a step
towards the quantitative understanding of whether the hopcount in Internet is described well by
the average graph distance in the configuration model.

In [1, Table II], many more examples are given of real networks that have power law degree
sequences. Interestingly, there are also many examples where power laws are not observed, and
often the degree law falls off faster than a power law. These observed degrees can be described by a
degree distribution as in (1.2) with $1 - F(x)$ smaller than any power, and the results in this paper
thus apply. Such examples are described in more detail in [1, Section II]. Examples where the tails
of the degree distribution are lighter than power laws are power and neural networks [1, Section
II.K], where the tails are observed to be exponential, and protein folding [1, Section II.L], where the
tails are observed to be Gaussian. In other examples, a degree distribution is found that for small
values is a power law, but has an exponential cut off. An example of such a degree distribution is

$$f_k = C k^{-\gamma} e^{-k/\kappa}, \tag{1.14}$$

for some $\kappa > 0$ and $\gamma \in \mathbb{R}$. The size of $\kappa$ indicates up to what degree the power law still holds,
and where the exponential cut off starts to set in. For this example, our results apply since the
exponential tail ensures that (1.2) holds for any $\tau > 3$ by picking $c > 0$ large enough. Thus, we
prove the conjectures on the expected path lengths in [35, (55), (56)] and [1, Section V.C, (63) and
(64)] for this particular model.

1.5 Simulation for illustration of the main results 

To illustrate Theorem 1.1 we have simulated the random graph with degree distribution
$D = \lceil U^{-1/(\tau-1)} \rceil$, where $U$ is uniformly distributed over $(0, 1)$ and where for $x \in \mathbb{R}$, $\lceil x \rceil$ is the smallest
integer greater than or equal to $x$. Thus,

$$1 - F(k) = P(U^{-1/(\tau-1)} > k) = k^{1-\tau}, \qquad k = 1, 2, 3, \ldots,$$

for which $\mu = 1 + \zeta(\tau - 1)$ and $\nu = 2\zeta(\tau - 2)/\mu$.

We observe that for $\tau = 3.5$ and $N = 25{,}000$ and $N = 125{,}000$, the values $a_N = -0.62\ldots$
are identical up to two decimals. We hence expect, on the basis of our main theorem, that the
survival functions $P(H_N > k)$ for these two cases are similar. Because $\lfloor \log_\nu 25{,}000 \rfloor = 12$ and
$\lfloor \log_\nu 125{,}000 \rfloor = 14$, we expect that the empirical survival function for $N = 125{,}000$ is a shift of
the empirical survival function for $N = 25{,}000$, over the horizontal distance $14 - 12 = 2$. Figure 1
supports this claim, given the statistical inaccuracy. In Figure 1 we have also included the empirical
survival function for $N = 75{,}000$, for which $a_N = -0.99\ldots$, as the bold line. This empirical survival
function clearly has a different shape. Thus, the empirical survival function for $N = 75{,}000$ is not
a shift of the empirical survival function for $N = 25{,}000$ or $N = 125{,}000$.

Figure 1: Empirical survival functions of the hopcount for $\tau = 3.5$ and the values $N = 25{,}000$,
$N = 75{,}000$ (bold) and $N = 125{,}000$, based on samples of size 1,000.
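
The values of $\nu$, $\lfloor \log_\nu N \rfloor$ and $a_N$ quoted above can be checked numerically from $\mu = 1 + \zeta(\tau - 1)$ and $\nu = 2\zeta(\tau - 2)/\mu$. The sketch below is an illustration only: the helper names `zeta`, `nu_for` and `a_N` are hypothetical, and the zeta function is approximated by a partial sum with the first Euler-Maclaurin tail corrections rather than taken from a numerical library.

```python
import math


def zeta(s, terms=1000):
    """Riemann zeta for s > 1: partial sum plus Euler-Maclaurin tail terms."""
    head = sum(k ** -s for k in range(1, terms))
    m = float(terms)
    tail = m ** (1 - s) / (s - 1) + 0.5 * m ** -s + s * m ** (-s - 1) / 12
    return head + tail


def nu_for(tau):
    """nu = 2 zeta(tau - 2) / mu with mu = 1 + zeta(tau - 1); needs tau > 3."""
    mu = 1 + zeta(tau - 1)
    return 2 * zeta(tau - 2) / mu


def a_N(n, nu):
    """a_N = floor(log_nu N) - log_nu N, a number in (-1, 0]."""
    x = math.log(n) / math.log(nu)
    return math.floor(x) - x


if __name__ == "__main__":
    nu = nu_for(3.5)
    for n in (25_000, 75_000, 125_000):
        print(n, math.floor(math.log(n) / math.log(nu)), a_N(n, nu))
```

For $\tau = 3.5$ this reproduces $\lfloor \log_\nu 25{,}000 \rfloor = 12$ and $\lfloor \log_\nu 125{,}000 \rfloor = 14$, with nearly equal values of $a_N$ for these two sizes and a value close to $-1$ for $N = 75{,}000$.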

We finally demonstrate Corollary 1.2 for $\tau = 3.5$ in Figure 2. In this case $\nu^2 \approx 5$ and
$N_k = N_1 \nu^{2k}$, $k = 0, 1, 2, 3$. We take $N_1 = 5{,}000$, and so $N_2 = 25{,}000$, $N_3 = 125{,}000$, $N_4 = 625{,}000$. For
these values of $N_1, \ldots, N_4$, we have simulated the hopcount with 1,000 replications and we expect
from Corollary 1.2 that the survival functions run parallel at mutual distance 2.
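
A simulation of this kind can be sketched as follows. This is a minimal illustration and not the code used for the figures: the function names are hypothetical, the stub pairing is done by shuffling all stubs and pairing consecutive entries, and the hopcount between nodes 0 and 1 is found by breadth-first search, with disconnected pairs discarded as in the conditioning on $H_N < \infty$.

```python
import math
import random
from collections import deque


def sample_degree(tau, rng):
    """D = ceil(U^(-1/(tau-1))); 1 - rng.random() lies in (0, 1], so U > 0."""
    u = 1.0 - rng.random()
    return math.ceil(u ** (-1.0 / (tau - 1.0)))


def sample_graph(n, tau, rng):
    """Adjacency lists of one configuration-model graph with i.i.d. degrees."""
    degrees = [sample_degree(tau, rng) for _ in range(n)]
    if sum(degrees) % 2 == 1:
        degrees[-1] += 1                      # make the total degree even
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)                        # uniform random stub matching
    adj = [[] for _ in range(n)]
    for i in range(0, len(stubs), 2):
        u, v = stubs[i], stubs[i + 1]
        adj[u].append(v)
        adj[v].append(u)
    return adj


def hopcount(adj, source=0, target=1):
    """Breadth-first-search distance; math.inf if the nodes are not connected."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return math.inf


def empirical_survival(n, tau, runs, seed=0):
    """Empirical k -> P(H_N > k | H_N < infinity) from independent graphs."""
    rng = random.Random(seed)
    hops = [hopcount(sample_graph(n, tau, rng)) for _ in range(runs)]
    finite = [h for h in hops if h < math.inf]
    if not finite:
        return {}
    return {k: sum(h > k for h in finite) / len(finite)
            for k in range(max(finite) + 1)}
```

Calling `empirical_survival(25_000, 3.5, 1000)` would correspond to one curve of the kind shown in the figures; pure-Python runs at that size are slow, so smaller values of `n` are more practical for a quick check.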

1.6 Organization of the paper 

We will first review the relevant literature on branching processes in Section 2. We will then explain
how we can couple our degree model to independent branching processes in Section 3. This section
is also valuable for our coming paper [2^, where we study the case $\tau \in (2, 3)$. In particular, in
[2lj, we will use Lemmas A.2.2 and A.2.8 and Proposition A.3.1. The bounds for the coupling are
formulated in Sections 3.1, 3.2 and 3.3. In these sections, we will state the results on the coupling
that are needed in the proof of the main results, Theorems 1.1 and 1.4. Parts of this section apply
more generally, i.e., to $\tau \in (2, 3)$. We prove Theorems 1.1 and 1.4 in Section 4 and Theorem 1.5
in Section 5. The technical details of the coupling of $\{\hat{Z}_k^{(i)}\}$ to $\{\mathcal{Z}_k^{(i)}\}$ for $i = 1, 2$ are contained
in Section A.1, while the details of the coupling of $\{Z_k^{(i)}\}$ to $\{\hat{Z}_k^{(i)}\}$ for $i = 1, 2$ are in Section A.2.
Finally, we prove that at any fixed time $m$, with probability converging to 1, $Z_m^{(i)} = \mathcal{Z}_m^{(i)}$ for $i = 1, 2$
in Section A.3.

2 Review of branching process theory with finite mean 

Since we rely heavily on the theory of branching processes, we will briefly review this theory in 
the case where the expected value of the offspring distribution is finite. The theory of branching 
processes is well understood (see e.g. IZj). 
Figure 2: Empirical survival functions of the hopcount for $\tau = 3.5$ and the four values
$N_k = 5{,}000\,\nu^{2k}$, $k = 0, 1, 2, 3$, based on 1,000 runs.



For the formal definition of the delayed branching process (BP) that we consider here, we define a
double sequence $\{X_{n,i}\}_{n \ge 1, i \ge 1}$ of i.i.d. random variables each with distribution equal to the offspring
distribution $\{g_j\}_{j=0}^{\infty}$, where we recall

$$g_j = \frac{(j+1) f_{j+1}}{\mu}, \qquad j = 0, 1, \ldots. \tag{2.1}$$

We further let $X_{0,1}$ have probability mass function $f$ in (1.1), independently from $\{X_{n,i}\}_{n \ge 1, i \ge 1}$.
The BP $\{\mathcal{Z}_n\}$ is now defined by $\mathcal{Z}_0 = 1$ and

$$\mathcal{Z}_{n+1} = \sum_{i=1}^{\mathcal{Z}_n} X_{n,i}, \qquad n \ge 0.$$

Because $\tau > 3$, we have that both $E[\mathcal{Z}_1] = E[X_{0,1}] = \mu < \infty$ and $\nu = E[X_{1,1}] < \infty$. We further
assume that $\nu = E[X_{1,1}] > 1$, so that the BP is super-critical. Given that the $(n-1)$st generation
consists of $m$ individuals, the conditional expectation of $\mathcal{Z}_n$ equals $m\nu$, independently of the size of
the preceding generations, so that for $n > 1$, we have $E[\mathcal{Z}_n \mid \mathcal{Z}_{n-1}] = \mathcal{Z}_{n-1} \nu$. Hence, $W_n = \mathcal{Z}_n / (\mu \nu^{n-1})$ is
a martingale. Since $E[|W_n|] = E[W_n] = 1$, the sequence $E[|W_n|]$ is uniformly bounded by 1 and so
by Doob's martingale convergence theorem [42, p. 58] the sequence $W_n$ converges almost surely. If
we denote the a.s. limit by a proper random variable $W$, we obtain (1.7).
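
The delayed BP and its martingale $W_n = \mathcal{Z}_n/(\mu\nu^{n-1})$ are straightforward to simulate for a finitely supported $f$. The following sketch is illustrative only; the function name `simulate_W` and the finite-support assumption are not part of the paper.

```python
import random


def simulate_W(f, generations, rng):
    """One path W_1, ..., W_generations of W_n = Z_n / (mu * nu**(n-1)).

    f[j] = P(D = j) is the first-generation law; later generations use the
    size-biased law g[j] = (j + 1) * f[j + 1] / mu.
    """
    mu = sum(j * fj for j, fj in enumerate(f))
    g = [(j + 1) * f[j + 1] / mu for j in range(len(f) - 1)]
    nu = sum(j * gj for j, gj in enumerate(g))
    z = rng.choices(range(len(f)), weights=f)[0]        # Z_1 has law f
    path = [z / mu]
    for n in range(2, generations + 1):
        # Each of the z individuals has an independent offspring drawn from g.
        z = sum(rng.choices(range(len(g)), weights=g, k=z)) if z > 0 else 0
        path.append(z / (mu * nu ** (n - 1)))
    return path
```

For the 3-regular law `f = [0, 0, 0, 1]` every path is constant and equal to 1, in agreement with $W = 1$ a.s. for regular graphs. For other laws, averaging $W_n$ over many paths should give values close to $E[W_n] = 1$.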

There are only few examples where the limit random variable $W$ is known. It is known that $W$
has an atom at 0 of size $p \ge 0$, equal to the extinction probability of the (delayed) BP ($q = 1 - p$).
Conditioned on non-extinction the limit $W$ has an absolutely continuous density on $(0, \infty)$.

We need a result that follows from jH) concerning the speed of convergence of $W_n$ to $W$. Define

$$R_n = \frac{W_n}{\nu} \int_{\nu^n / n^\alpha}^{\infty} x \, dG(x), \qquad \alpha > 0,$$

where $G$ is the distribution function of the offspring with probabilities $\{g_j\}$. Since

$$\mu_\alpha = \int_0^{\infty} x [\log^+ x]^\alpha \, dG(x) < \infty, \qquad (\log^+ x = \max(0, \log x)),$$

for each $\alpha > 0$, it follows from ([HI page 8, line 4]) that with probability 1,

$$W - W_k + \sum_{n=k}^{\infty} R_n = o(k^{-\alpha}). \tag{2.2}$$



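An immediate consequence of (2.2) is that if $|W - W_k| > k^{-\alpha}$ then $\sum_{n=k}^{\infty} R_n > k^{-\alpha}$. Hence, using
$E[W_n] = 1$ and partial integration,

$$P\Big(\sum_{n=k}^{\infty} R_n > k^{-\alpha}\Big) \le k^\alpha \sum_{n=k}^{\infty} E[R_n] = -\sum_{n=k}^{\infty} \frac{k^\alpha}{\nu} \int_{\nu^n / n^\alpha}^{\infty} x \, d[1 - G(x)]$$

$$= \sum_{n=k}^{\infty} \frac{k^\alpha}{\nu} \frac{\nu^n}{n^\alpha} [1 - G(\nu^n / n^\alpha)] + \sum_{n=k}^{\infty} \frac{k^\alpha}{\nu} \int_{\nu^n / n^\alpha}^{\infty} [1 - G(x)] \, dx.$$

Since $1 - F(x) \le c \cdot x^{1-\tau}$ (see (1.2)), we find $1 - G(x) \le c' \cdot x^{2-\tau}$, so that for each $\alpha > 0$, and with
$k = \lfloor \frac{1}{2} \log_\nu N \rfloor$,

$$P(|W - W_k| > (\log N)^{-\alpha}) \le O((\log N)^\alpha) \sum_{n=k}^{\infty} (\nu^n / n^\alpha)^{3-\tau} = O(e^{-\beta \log N}) = O(N^{-\beta}), \tag{2.3}$$

for some positive $\beta$, because $\tau > 3$ and $\nu > 1$.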

3 Graph construction and coupling with a BP 

In this section, we will describe how the shortest path graph (SPG) from node 1 can be obtained,
and we will couple it to a BP. This coupling works for any degree distribution. In Sections 3.2 and
3.3 below, we will obtain bounds on the coupling.

The SPG from node 1 is the random graph as observed from node 1, and consists of the shortest
paths between node 1 and all other nodes $\{2, \ldots, N\}$. As will be shown below, it is not necessarily
a tree because cycles may occur. Recall that two stubs together form an edge. We define $Z_1^{(1)} = D_1$,
and for $k \ge 2$, we denote by $Z_k^{(1)}$ the number of stubs that are attached to nodes at distance $k - 1$ from node
1, but are not part of an edge connected to a node at distance $k - 2$. We will refer to such stubs as
'free stubs'. Thus, $Z_k^{(1)}$ is the number of outgoing stubs from nodes at distance $k - 1$.

In Section 3.1 we will describe a coupling that, conditionally on $D_1, \ldots, D_N$, couples $\{Z_k^{(1)}\}$ to
a BP $\{\hat{Z}_k^{(1)}\}$ with the random offspring distribution

$$g_j^{(N)} = \sum_{i=1}^{N} I[D_i = j+1]\,P(\text{a stub from node } i \text{ is sampled} \mid D_1, \ldots, D_N)$$
$$= \sum_{i=1}^{N} I[D_i = j+1]\,\frac{D_i}{L_N} = \frac{j+1}{L_N}\sum_{i=1}^{N} I[D_i = j+1], \qquad (3.1)$$

where as before $L_N = D_1 + D_2 + \ldots + D_N$. By the strong law of large numbers, for $N \to \infty$,

$$\frac{L_N}{N} \to E[D], \quad \text{and} \quad \frac{1}{N}\sum_{i=1}^{N} I[D_i = j+1] \to P(D = j+1), \quad \text{a.s.},$$

so that a.s.,

$$g_j^{(N)} \to (j+1)P(D = j+1)/E[D] = g_j, \qquad N \to \infty. \qquad (3.2)$$

Therefore, the BP $\{\hat{Z}_k^{(1)}\}$ with offspring distribution $\{g_j^{(N)}\}$ is expected to be close to a BP with
offspring distribution $\{g_j\}$ given in (1.6). Consequently, in Section 3.3 we will couple the BP $\{\hat{Z}_k^{(1)}\}$
to a BP $\{\mathcal{Z}_k^{(1)}\}$ with offspring distribution $\{g_j\}$. This will allow us to prove Theorems 1.1 and 1.4
in Section 4.
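The random offspring law in (3.1) is easy to compute for a concrete degree list. The following is a minimal sketch, not part of the original argument; the function names and the toy degree sequence are hypothetical and chosen only for illustration. It uses exact rational arithmetic so that the normalization can be checked directly.

```python
from collections import Counter
from fractions import Fraction


def empirical_offspring_distribution(degrees):
    """Return {j: g_j^(N)} for the degree list D_1, ..., D_N, following (3.1)."""
    total_stubs = sum(degrees)   # L_N
    counts = Counter(degrees)    # number of nodes having each degree
    # A uniformly chosen stub belongs to a node of degree j + 1 with
    # probability (j + 1) * #{i : D_i = j + 1} / L_N.
    return {d - 1: Fraction(d * c, total_stubs) for d, c in counts.items() if d > 0}


def offspring_mean(g):
    """Mean of an offspring law given as {n: probability}."""
    return sum(n * p for n, p in g.items())


degrees = [1, 2, 2, 3]           # hypothetical toy degree sequence
g_N = empirical_offspring_distribution(degrees)
mean_N = offspring_mean(g_N)     # equals sum_i D_i (D_i - 1) / L_N
```

For this toy sequence the mean agrees with $\sum_i D_i(D_i-1)/L_N$, the empirical analogue of $\nu = E[D(D-1)]/E[D]$.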

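Throughout the paper we use the following lemma. It shows that $L_N$ is close to $E[L_N] = \mu N$.

Lemma 3.1 (Concentration of $L_N$) For each $0 < a < \tfrac12$, $b = 1 - 2a$ and some constant $c > 0$,

$$P\Big(\Big|\frac{L_N}{\mu N} - 1\Big| \ge N^{-a}\Big) \le cN^{-b}. \qquad (3.3)$$

Proof. The proof is immediate from the Chebychev inequality, since

$$P\Big(\Big|\frac{L_N}{\mu N} - 1\Big| \ge N^{-a}\Big) \le \frac{N^{2a}\,\mathrm{Var}(L_N)}{\mu^2N^2} = \frac{\mathrm{Var}(D_1)}{\mu^2}\,N^{2a-1},$$

so that $b = 1 - 2a > 0$ and $c = \mathrm{Var}(D_1)/\mu^2 < \infty$. □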



3.1 Coupling with a branching process with offspring $g^{(N)}$

We will construct the SPG in such a way that we simultaneously construct a BP with offspring
distribution $\{g_j^{(N)}\}$ in (3.1). This BP is of course purely imaginary. The BP is coupled with the
SPG such that it enables us to control their difference.

As above, we will use the notation $Z_k^{(1)}$ and $Z_k^{(2)}$ to denote the number of stubs attached to nodes
at distance $k-1$ from node 1, respectively, node 2, but not part of an edge connected to a node at
distance $k-2$. For $k = 1$, $Z_1^{(1)} = D_1$. We start with a description of the coupling of the SPG with
root 1, and a BP with offspring distribution $g^{(N)}$ given in (3.1). The first stages of the generation
of the SPG are drawn in Figure 3. We will explain the meaning of the labels 1, 2 and 3 below.

We draw repeatedly and independently from the distribution $\{g_j^{(N)}\}$. This is done conditionally
given $D_1, D_2, \ldots, D_N$, so that we draw from the random distribution (3.1). After each draw we will
update the realization of the SPG and the BP, and classify the stubs according to three categories, 
which will be labelled 1, 2 and 3. These labels will be updated as the growth of the SPG proceeds. 
The labels have the following meaning: 

1. Stubs with label 1 are stubs belonging to a node that is not yet attached to the SPG. 

2. Stubs with label 2 are attached to the SPG (because the corresponding node has been chosen), 
but not yet paired with another stub. These are called 'free stubs'. 

3. Stubs with label 3 in the SPG are paired with another stub to form an edge in the SPG. 

The growth process as depicted in Figure 3 starts by giving all stubs label 1. Then, because we
construct the SPG starting from node 1, we relabel the $D_1$ stubs of node 1 with the label 2. We
note that $Z_1^{(1)}$ is equal to the number of stubs connected to node 1, and thus $Z_1^{(1)} = D_1$. We next
identify $Z_j^{(1)}$ for $j > 1$. $Z_j^{(1)}$ is obtained by sequentially growing the SPG from the free stubs in
generation $Z_{j-1}^{(1)}$. When all free stubs in generation $j-1$ have chosen their connecting stub, $Z_j^{(1)}$
is equal to the number of stubs labelled 2 (i.e., free stubs) attached to the SPG. Note that not
necessarily each stub of $Z_{j-1}^{(1)}$ contributes to stubs of $Z_j^{(1)}$, because a cycle may 'swallow' two free
stubs in generation $j-1$. This is the case precisely when a stub with label 2 is chosen.

[Figure 3 graphic not reproduced; panel title: SPG stubs with their labels.]

Figure 3: Schematic drawing of the growth of the SPG from node 1 with $N = 9$ and the
updating of the labels. The stubs without labels have label 1. The first line shows the different
degrees. The growth process starts by choosing the first stub of node 1, whose stubs are labeled by 2
as illustrated in the second line, while all the other stubs maintain the label 1. Next, we uniformly
choose a stub with label 1 or 2. In the example in line 3, this is the second stub from node 3, whose
stubs are labeled by 2 except for the second stub, which is labeled 3. The left-hand column
visualizes the growth of the SPG by the attachment of stub 2 of node 3 to the first stub of node 1. Once
an edge is established, the paired stubs are labeled 3. In the next step, the next stub of node 1 is
again matched to a uniform stub out of those with label 1 or 2. In the example in line 4, it is the
first stub of the last node that will be attached to the second stub of node 1, the next in sequence
to be paired. The last line exhibits the result of creating a cycle when the first stub of node 3 is
chosen to be attached to the last stub of node 9 (the last node). This process is continued until
there are no more stubs with labels 1 or 2. In this example, we have $Z_1^{(1)} = 3$ and $Z_2^{(1)} = 6$.

For the BP, we start with $\hat{Z}_1^{(1)} = D_1$, and grow from the free stubs available in the BP tree by
sequentially growing from the stubs (as for the SPG). For the coupling, as long as there are free
stubs in both the BP and the SPG in a given generation, we couple the BP and SPG in the following
way. At each step we will take an independent draw from all stubs, according to the distribution
(3.1). Since the stubs are specified by their label (1, 2 or 3), we can now present the construction
rules for the BP and the SPG.

1. If the chosen stub has label 1, then in both the BP and the SPG we will connect the present 
stub to the chosen stub to form an edge and attach the remaining stubs of the chosen node 
as children. We update the labels as follows. The present and chosen stub melt together to 
form an edge and both are assigned label 3. All 'brother' stubs (except for the chosen stub) 
belonging to the same node of the chosen stub receive label 2. 

2. In this case we choose a stub with label 2, which is already connected to the SPG. For the BP, 
the chosen stub is simply connected to the stub which is grown, and the number of free stubs 
is the number of 'brother stubs' of the chosen stub. For the SPG, a self-loop is created when 
the chosen stub and present stub are 'brother' stubs which belong to the same node. When 
they are not 'brother' stubs, then a cycle is formed. Neither a self-loop nor a cycle changes 
the distances in the SPG. Note that for the SPG two free stubs are used, while for the BP 
only one stub is used. This is illustrated in Figure 4.

The updating of the labels solely consists of changing the label of the present and the chosen 
stub from 2 to 3. 

3. A stub with label 3 is chosen. This case is illustrated in Figure 5. This possibility of choosing
an already matched stub with label 3 must be included for the BP, which relies on the property
that all subsequent iterations in the process are i.i.d. Note that this includes the case where
we draw the present stub, which of course is impossible for the SPG.

The rule now for the BP is that the corresponding node with the prescribed number of stubs
is simply attached. Since for the SPG, we sample without replacement, we have to resample
from distribution (3.1) until we draw a stub with label 1 or 2. This procedure is referred to
as a redraw. Since we sample uniformly from all stubs, the conditional sampling until we hit a
stub with label 1 or 2 is also uniform out of the set of all stubs with labels 1 and 2, so that it
has the correct distribution. Obviously there are two cases: either we draw a stub with label
1 or one with label 2. When we draw a stub with label 1 in the SPG then we update as under
rule 1 above, while when we draw a stub having label 2 in the SPG, we update as under rule
2 above.

[Figure 4 graphic not reproduced; panels: SPG, BP.]

Figure 4: Example of the coupling when a cycle occurs. Edges have twice the length of stubs. In
the SPG the two dotted stubs in the left picture are to be connected. The middle picture gives the
result of creating the cycle in the SPG where the bold line is the edge creating the cycle. The third
figure draws the BP where the cycle is removed and the degree of the circled node is 3.

[Figure 5 graphic not reproduced; panels: SPG, BP.]

Figure 5: An example of the coupling where we need to perform a redraw. In the draw from $g^{(N)}$,
we draw the dotted stub in the SPG with degree 3. In the BP, we keep this degree, while in the
SPG we draw again from the conditional distribution given that we do not draw a stub with label
3. In this example, this redraw gives the value $D = 2$.

Clearly, the redraws and the cycles cause possible differences between the BP and the SPG: the 
degrees of the chosen node are possibly different. We will need to show that the above difference 
only leads to an error term. 

The above process stops in the $j$th generation when there are no more free stubs in generation
$j-1$ for either the BP or for the SPG. When there are no more free stubs for the SPG, we complete
the $j$th generation for the BP by drawing from distribution (3.1) for all the remaining free stubs.
The labels of the stubs remain unchanged. When there are no more free stubs for the BP, we
complete the $j$th generation for the SPG by drawing from distribution (3.1) iteratively until we
draw a stub with label 1 or 2. This is done for all the remaining free stubs in the $j$th generation of
the SPG. The labels are updated as under 1 and 2 above.

We continue the above process of drawing stubs until there are no more stubs having label 1 
or 2, so that all stubs have label 3. Then, the construction is finalized, and we have generated the 
SPG as seen from node 1. We have thus obtained the structure of the SPG, and know how many
nodes there are at a given distance from node 1.

The above construction will be performed similarly from node 2. This construction is close to
being independent as long as the SPG's from the roots 1 and 2 do not share any nodes. More
precisely, the corresponding BP's are independent. Thus, we have now constructed the SPG's and
BP's from both node 1 and node 2.
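The label bookkeeping for the SPG side of the construction can be illustrated in code. What follows is only a sketch under simplifying assumptions: it grows the SPG alone (no coupled BP), it draws the partner stub uniformly from all stubs with label 1 or 2, which has the same law as redrawing until such a stub is hit, and it stops once the component of the root is exhausted. The function name `grow_spg` and its arguments are hypothetical.

```python
import random


def grow_spg(degrees, root=0, seed=0):
    """Return [Z_1, Z_2, ...], the free-stub counts per generation seen from `root`."""
    if sum(degrees) % 2:
        raise ValueError("the total number of stubs L_N must be even")
    rng = random.Random(seed)
    owner = [v for v, d in enumerate(degrees) for _ in range(d)]  # stub -> node
    stubs_of = {v: [] for v in range(len(degrees))}
    for s, v in enumerate(owner):
        stubs_of[v].append(s)
    label = [1] * len(owner)            # label 1: node not yet attached
    for s in stubs_of[root]:
        label[s] = 2                    # label 2: free stub of the SPG
    generation = list(stubs_of[root])
    sizes = [len(generation)]
    while generation:
        for s in generation:
            if label[s] != 2:
                continue                # swallowed earlier by a cycle or self-loop
            # Uniform draw among unpaired stubs other than s (labels 1 and 2).
            candidates = [t for t in range(len(owner)) if t != s and label[t] != 3]
            t = rng.choice(candidates)
            if label[t] == 1:           # rule 1: a new node joins the SPG
                for u in stubs_of[owner[t]]:
                    label[u] = 2
            label[s] = label[t] = 3     # label 3: paired into an edge
        # After a full generation, the stubs still labelled 2 are the new free stubs.
        generation = [u for u in range(len(owner)) if label[u] == 2]
        if generation:
            sizes.append(len(generation))
    return sizes
```

For instance, `grow_spg([1, 1])` returns `[1]`, since the single stub of the root must pair with the single stub of the other node, and `grow_spg([2])` returns `[2]`, the two stubs forming a self-loop.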



3.2 Coupling with a BP with offspring distribution $\{g_j^{(N)}\}$

In the previous section, we have obtained a coupling of the SPG and the BP with offspring
distribution $\{g_j^{(N)}\}$. In this and the next section, we will summarize bounds on the couplings that we need
for the proof of Theorems 1.1 and 1.4. These results will be repeated in the appendix together with
a full proof. We start with the coupling of the number of stubs $Z_j^{(1)}$ in the SPG and the number of
children $\hat{Z}_j^{(1)}$ in the $j$th generation of the BP with offspring distribution $\{g_j^{(N)}\}$.

Proposition 3.2 (Coupling SPG with the BP with random offspring distribution) There
exist $\eta, \beta > 0$, $\alpha > \tfrac12 + \eta$ and a constant $C$, such that for all $j \le (\tfrac12 + \eta)\log_\nu N$,

$$P\Big((1 - N^{-\alpha}\nu^j)\hat{Z}_j^{(1)} \le Z_j^{(1)} \le (1 + N^{-\alpha}\nu^j)\hat{Z}_j^{(1)}\Big) \ge 1 - CjN^{-\beta}. \qquad (3.4)$$

3.3 Coupling with a BP with offspring distribution $\{g_j\}$

We next describe the coupling with the BP with offspring distribution $\{g_j\}$ and their bounds. A
classical coupling argument is used (see e.g. [SSI]). Let $X^{(N)}$ have law $\{g_j^{(N)}\}$ and $X$ have law $\{g_j\}$.
We define $Y^{(N)}$ by

$$P(Y^{(N)} = n) = \min(g_n^{(N)}, g_n), \qquad P(Y^{(N)} = \infty) = 1 - \sum_{n=0}^{\infty}\min(g_n^{(N)}, g_n) = \frac12\sum_{n=0}^{\infty}|g_n^{(N)} - g_n|. \qquad (3.5)$$

Let $\hat{X}^{(N)} = Y^{(N)}$ when $Y^{(N)} < \infty$, and $P(\hat{X}^{(N)} = n, Y^{(N)} = \infty) = g_n^{(N)} - \min(g_n^{(N)}, g_n)$, whereas
$\hat{X} = Y^{(N)}$ when $Y^{(N)} < \infty$, and $P(\hat{X} = n, Y^{(N)} = \infty) = g_n - \min(g_n^{(N)}, g_n)$. Then $\hat{X}^{(N)}$ has law $g^{(N)}$,
and $\hat{X}$ has law $g$. Moreover, with large probability, $\hat{X}^{(N)} = \hat{X}$ due to Proposition 3.4 below.
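The coupling in (3.5) is the standard maximal coupling of two discrete laws. As an illustration only, the sketch below computes the failure probability $P(Y^{(N)} = \infty)$ and draws a coupled pair; the function names and the two toy laws are hypothetical, and in the failure branch the two coordinates are drawn independently from the residual laws, which is one admissible choice among several.

```python
import random
from fractions import Fraction


def overlap(g_n, g):
    """Pointwise minimum min(g_n^(N), g_n) of two laws given as {n: probability}."""
    support = sorted(set(g_n) | set(g))
    return {n: min(g_n.get(n, 0), g.get(n, 0)) for n in support}


def failure_probability(g_n, g):
    """P(Y = infinity) in (3.5): one minus the total overlapping mass."""
    return 1 - sum(overlap(g_n, g).values())


def sample_pair(g_n, g, rng):
    """Draw (X_hat_N, X_hat) with marginals g_n and g, equal unless the coupling fails."""
    common = overlap(g_n, g)
    p_fail = 1 - sum(common.values())

    def draw(weights):
        items = [(n, w) for n, w in weights.items() if w > 0]
        values = [n for n, _ in items]
        return rng.choices(values, weights=[float(w) for _, w in items])[0]

    if rng.random() >= p_fail:
        y = draw(common)                 # success: both variables equal Y
        return y, y
    # Failure: sample each coordinate from its own residual mass.
    x_n = draw({n: g_n.get(n, 0) - c for n, c in common.items()})
    x = draw({n: g.get(n, 0) - c for n, c in common.items()})
    return x_n, x


g_emp = {0: Fraction(1, 2), 1: Fraction(1, 2)}   # hypothetical g^(N)
g_lim = {0: Fraction(1, 4), 1: Fraction(3, 4)}   # hypothetical g
p_fail = failure_probability(g_emp, g_lim)
half_l1 = sum(abs(g_emp[n] - g_lim[n]) for n in g_emp) / 2
```

For the two toy laws, the failure probability and half the $\ell^1$ distance coincide, as the last equality in (3.5) requires.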

This coupling argument is applied to each node in the BP $\{\hat{Z}_i^{(1)}\}_{i\ge0}$ and $\{\hat{Z}_i^{(2)}\}_{i\ge0}$. The BP's
with offspring distribution $\{g_j\}$ will be denoted by $\{\mathcal{Z}_i^{(1)}\}_{i\ge0}$ and $\{\mathcal{Z}_i^{(2)}\}_{i\ge0}$. We can interpret this
coupling as follows. Each node has an i.i.d. indicator variable which equals one with probability

$$p_N = \frac12\sum_{n=0}^{\infty}|g_n^{(N)} - g_n|. \qquad (3.6)$$

When at a certain node this indicator variable is 0, then the offspring in $\{\hat{Z}_i^{(1)}\}_{i\ge0}$ or $\{\hat{Z}_i^{(2)}\}_{i\ge0}$ equals
the one in $\{\mathcal{Z}_i^{(1)}\}_{i\ge0}$ or $\{\mathcal{Z}_i^{(2)}\}_{i\ge0}$, and the node is successfully coupled. When the indicator is 1,
then an error has occurred, and the coupling is not successful. In this case, the law of the offspring
of $\{\hat{Z}_i^{(1)}\}_{i\ge0}$ or $\{\hat{Z}_i^{(2)}\}_{i\ge0}$ is different from the one in $\{\mathcal{Z}_i^{(1)}\}_{i\ge0}$ or $\{\mathcal{Z}_i^{(2)}\}_{i\ge0}$, and we record an error.
Below we will use the notation $P_N$ to denote the conditional probability given $D_1, D_2, \ldots, D_N$ and
$E_N$ to denote the expectation with respect to the probability measure $P_N$. Finally, we write

$$\nu_N = \sum_{n=0}^{\infty} n\,g_n^{(N)}. \qquad (3.7)$$



In the following proposition, we prove that at any fixed time, we can couple the SPG to the
delayed BP with law $\{g_j\}$:

Proposition 3.3 (Coupling at fixed time) For any $m \in \mathbb{N}$ fixed, there exist independent
branching processes $\mathcal{Z}^{(1)}, \mathcal{Z}^{(2)}$, such that

$$\lim_{N\to\infty} P\big(Z_m^{(i)} = \mathcal{Z}_m^{(i)}\big) = 1. \qquad (3.8)$$



In the course of the proof we will also rely on the following more technical claims: 

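Proposition 3.4 (Convergence in total variation distance) There exist $\alpha_2, \beta_2 > 0$ such that

$$P\Big(\sum_{n=0}^{\infty}(n+1)|g_n^{(N)} - g_n| \ge N^{-\alpha_2}\Big) \le N^{-\beta_2}. \qquad (3.9)$$

Consequently,

$$P\big(|\nu_N - \nu| \ge N^{-\alpha_2}\big) \le N^{-\beta_2}, \qquad (3.10)$$

and

$$P\big(p_N \ge N^{-\alpha_2}\big) \le N^{-\beta_2}. \qquad (3.11)$$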



Corollary 3.5 (Coupling of sums) There exist $\varepsilon, \beta, \eta > 0$ such that for all $j \le (1 + 2\eta)\log_\nu N$,
as $N \to \infty$,

3 3 



7(2) 7C 
ri/21 ^ LV2J ^ R/21 ^ b 

4=1 i=l 



(2) 
/2J 



0{N' 



(3.12) 



4 Proof of Theorem 1.1 and 1.4

The proof consists of four steps.

1. We first express the survival probability $P(H_N > j)$ in the number of stubs $\{Z_j^{(k)}\}$, $k = 1, 2$, of
the SPG's. For $j \le (1 + 2\eta)\log_\nu N$, where $\eta$ is specified in Proposition 3.2, we will show that

$$P(H_N > j) = E\Big[\exp\Big\{-\frac{\sum_{i=2}^{j+1} Z^{(1)}_{\lceil i/2\rceil}Z^{(2)}_{\lfloor i/2\rfloor}}{L_N}\Big\} + RM_N(j)\Big], \qquad (4.1)$$

with

$$RM_N(j) = O\Big(\frac{\sum_{i=2}^{j+1} Z^{(1)}_{\lceil i/2\rceil}Z^{(2)}_{\lfloor i/2\rfloor}\sum_{k=1}^{\lceil i/2\rceil}\big(Z_k^{(1)} + Z_k^{(2)}\big)}{L_N^2}\Big).$$

2. We use Proposition 3.2 to show that in (4.1) we can replace $\{Z_j^{(i)}\}$, $i = 1, 2$, by the BP $\{\hat{Z}_j^{(i)}\}$, $i = 1, 2$.
The error term $E[|RM_N(j)|]$ and the error involved in replacing the SPG by the BP are
bounded by a constant times $N^{-\beta}$, for some $\beta > 0$, uniformly in $j \le (1 + 2\eta)\log_\nu N$.

3. In this step we show that there exists $\beta > 0$ such that for all $j \le (1 + 2\eta)\log_\nu N$, as $N \to \infty$,

$$P(H_N > j) = E\Big[\exp\Big\{-\frac{\sum_{i=2}^{j+1} \mathcal{Z}^{(1)}_{\lceil i/2\rceil}\mathcal{Z}^{(2)}_{\lfloor i/2\rfloor}}{\mu N}\Big\}\Big] + O(N^{-\beta}), \qquad (4.2)$$

where $\mathcal{Z}_k^{(i)}$, $i = 1, 2$, denotes the delayed BP with offspring distribution (1.6).

4. We complete the proof of Theorems 1.1 and 1.4 using step 3, and the almost sure limit in (1.7)
applied to $\mathcal{Z}_n^{(1)}$ and $\mathcal{Z}_n^{(2)}$. We finally use the speed of convergence of the above martingale limit
result to obtain (1.9).

Step 1: A formula for $P(H_N > j)$. The following lemma expresses $P(H_N > j)$ in terms of
the conditional probabilities $Q_Z^{(k,l)}$ given $\{Z_s^{(1)}\}_{s=1}^{k}$ and $\{Z_s^{(2)}\}_{s=1}^{l}$. For $l = 0$, we only condition on
$\{Z_s^{(1)}\}_{s=1}^{k}$.

Lemma 4.1 For $j \ge 1$,

$$P(H_N > j) = E\Big[\prod_{i=2}^{j+1} Q_Z^{(\lceil i/2\rceil, \lfloor i/2\rfloor)}(H_N > i-1 \mid H_N > i-2)\Big]. \qquad (4.3)$$

Proof. We first compute that

$$P(H_N > j) = E\big[Q_Z^{(1,1)}(H_N > j)\big] = E\big[Q_Z^{(1,1)}(H_N > 1)\,Q_Z^{(1,1)}(H_N > j \mid H_N > 1)\big].$$

Continuing this further, and writing $E_Z^{(k,l)}$ for the expectation with respect to $Q_Z^{(k,l)}$,

$$Q_Z^{(1,1)}(H_N > j \mid H_N > 1) = E_Z^{(1,1)}\big[Q_Z^{(2,1)}(H_N > j \mid H_N > 1)\big]$$
$$= E_Z^{(1,1)}\big[Q_Z^{(2,1)}(H_N > 2 \mid H_N > 1)\,Q_Z^{(2,1)}(H_N > j \mid H_N > 2)\big].$$

Therefore,

$$P(H_N > j) = E\big[Q_Z^{(1,1)}(H_N > 1)\,E_Z^{(1,1)}\big[Q_Z^{(2,1)}(H_N > 2 \mid H_N > 1)\,Q_Z^{(2,1)}(H_N > j \mid H_N > 2)\big]\big]$$
$$= E\big[E_Z^{(1,1)}\big[Q_Z^{(1,1)}(H_N > 1)\,Q_Z^{(2,1)}(H_N > 2 \mid H_N > 1)\,Q_Z^{(2,1)}(H_N > j \mid H_N > 2)\big]\big]$$
$$= E\big[Q_Z^{(1,1)}(H_N > 1)\,Q_Z^{(2,1)}(H_N > 2 \mid H_N > 1)\,Q_Z^{(2,1)}(H_N > j \mid H_N > 2)\big],$$

where, in the second equality, we use that $Q_Z^{(1,1)}(H_N > 1)$ is measurable with respect to the $\sigma$-algebra
generated by $Z_1^{(1)}, Z_1^{(2)}$. This proves the claim for $j = 2$.

More generally, we obtain that for $k, l$ such that $k + l \le j - 1$,

$$Q_Z^{(k,l)}(H_N > j \mid H_N > k+l-1) = E_Z^{(k,l)}\big[Q_Z^{(k+1,l)}(H_N > j \mid H_N > k+l-1)\big]$$
$$= E_Z^{(k,l)}\big[Q_Z^{(k+1,l)}(H_N > k+l \mid H_N > k+l-1)\,Q_Z^{(k+1,l)}(H_N > j \mid H_N > k+l)\big],$$

and, similarly,

$$Q_Z^{(k,l)}(H_N > j \mid H_N > k+l-1) = E_Z^{(k,l)}\big[Q_Z^{(k,l+1)}(H_N > k+l \mid H_N > k+l-1)\,Q_Z^{(k,l+1)}(H_N > j \mid H_N > k+l)\big].$$

In the above formulas, we can choose to increase $k$ or $l$ by one depending on $\{Z_s^{(1)}\}_{s=1}^{k}$ and
$\{Z_s^{(2)}\}_{s=1}^{l}$. We will iterate the above recursions, until $k + l = j - 1$, when the last term becomes
1. This yields that

$$P(H_N > j) = E\Big[\prod_{i=1}^{j} Q_Z^{(\lceil (i+1)/2\rceil, \lfloor (i+1)/2\rfloor)}(H_N > i \mid H_N > i-1)\Big]. \qquad (4.4)$$

Renumbering gives the final result. □
We will next prove (4.1). In order to do so, we start by proving upper and lower bounds on
the probabilities of not connecting two sets of stubs to each other. For this, suppose we have two
disjoint sets of stubs $A$ with $|A| = n$ and $B$ with $|B| = m$ out of a total of $L$ stubs. We match
stubs at random, in such a way that two stubs form one edge, as in the construction of the SPG.
In particular, loops are possible.

Let $p(n, m, L)$ denote the probability that none of the $n$ stubs in $A$ attaches to one of the $m$
stubs in $B$. Then, by conditioning on whether we choose a stub in $A$ or not, we obtain the recursion

$$p(n, m, L) = \frac{n-1}{L-1}\,p(n-2, m, L-2) + \Big(1 - \frac{n-1}{L-1} - \frac{m}{L-1}\Big)\,p(n-1, m, L-2). \qquad (4.5)$$

Since $p(n-2, m, L-2) \ge p(n-1, m, L-2)$, because we have to match one additional stub, we
obtain

$$p(n, m, L) \ge \Big(1 - \frac{m}{L-1}\Big)\,p(n-1, m, L-2) \ge \prod_{i=0}^{n-1}\Big(1 - \frac{m}{L-2i-1}\Big). \qquad (4.6)$$
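The recursion (4.5) and the lower bound (4.6) can be evaluated exactly for small parameters. The sketch below is illustrative only: it assumes the boundary value $p(n, m, L) = 1$ for $n \le 0$ (no stubs of $A$ left to match), an even $L \ge n + m$, and hypothetical function names.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def p(n, m, L):
    """Probability that none of the n stubs of A pairs with one of the m stubs of B."""
    if n <= 0:
        return Fraction(1)               # assumed boundary case: nothing left to match
    to_a = Fraction(n - 1, L - 1)        # first stub of A pairs with another stub of A
    to_b = Fraction(m, L - 1)            # first stub of A pairs with a stub of B
    return to_a * p(n - 2, m, L - 2) + (1 - to_a - to_b) * p(n - 1, m, L - 2)


def lower_bound(n, m, L):
    """Right-hand side of (4.6): product of 1 - m / (L - 2i - 1) over i = 0, ..., n - 1."""
    out = Fraction(1)
    for i in range(n):
        out *= 1 - Fraction(m, L - 2 * i - 1)
    return out
```

As a sanity check, with two stubs in $A$, one stub in $B$ and four stubs in total there are three perfect matchings, exactly one of which avoids an edge between $A$ and $B$, and indeed `p(2, 1, 4)` evaluates to `Fraction(1, 3)`.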



On the other hand, we can rewrite (4.5) as

$$p(n, m, L) = \Big(1 - \frac{m}{L-1}\Big)\,p(n-1, m, L-2) + \frac{n-1}{L-1}\big(p(n-2, m, L-2) - p(n-1, m, L-2)\big). \qquad (4.7)$$

We claim that

$$p(n-2, m, L-2) - p(n-1, m, L-2) = \frac{m}{L-3}\,p(n-2, m-1, L-2) \le \frac{m}{L-3}. \qquad (4.8)$$

Indeed, the difference $p(n-2, m, L-2) - p(n-1, m, L-2)$ is equal to the probability of the event
that the first $n-2$ stubs do not connect to $B$, while the last one does. By exchangeability of the
stubs, this probability equals the probability that the first stub is attached to a stub in $B$, and the
remaining $n-2$ stubs are not. This latter probability is equal to $\frac{m}{L-3}\,p(n-2, m-1, L-2)$.
The equations (4.7) and (4.8) yield

$$p(n, m, L) \le \Big(1 - \frac{m}{L-1}\Big)\,p(n-1, m, L-2) + \frac{(n-1)m}{(L-1)(L-3)}.$$

Iteration gives the upper bound

$$p(n, m, L) \le \prod_{i=0}^{n-1}\Big(1 - \frac{m}{L-2i-1}\Big) + \frac{n^2m}{(L-2n)^2}. \qquad (4.9)$$



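Since the event $\{H_N > 1\}$ holds if and only if no stub of root 1 attaches to one of those of root
2, we obtain, using (4.6) and (4.9), that

$$\prod_{i=0}^{Z_1^{(1)}-1}\Big(1 - \frac{Z_1^{(2)}}{L_N - 2i - 1}\Big) \le Q_Z^{(1,1)}(H_N > 1) \le \prod_{i=0}^{Z_1^{(1)}-1}\Big(1 - \frac{Z_1^{(2)}}{L_N - 2i - 1}\Big) + \frac{(Z_1^{(1)})^2Z_1^{(2)}}{(L_N - 2Z_1^{(1)})^2}. \qquad (4.10)$$

Similarly,

$$Q_Z^{(2,1)}(H_N > 2 \mid H_N > 1) \ge \prod_{i=0}^{Z_2^{(1)}-1}\Big(1 - \frac{Z_1^{(2)}}{L_N - 2Z_1^{(1)} - 2i - 1}\Big),$$

with a matching upper bound with an error term bounded by

$$\frac{(Z_2^{(1)})^2Z_1^{(2)}}{(L_N - 2Z_1^{(1)} - 2Z_2^{(1)})^2}.$$

We use that, for natural numbers $n, m, M$ with $M + n + m = o(L)$,

$$\prod_{i=0}^{n-1}\Big(1 - \frac{m}{L - M - 2i - 1}\Big) = \exp\Big\{-\frac{nm}{L} + O\Big(\frac{nm(M + n + m)}{L^2}\Big)\Big\}, \qquad L \to \infty. \qquad (4.11)$$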
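The expansion of the product into an exponential can be checked numerically. The sketch below is only an illustration with hypothetical function names and parameter values; it compares the product with $\exp\{-nm/L\}$ in floating point and measures the gap against the scale $nm(M+n+m)/L^2$ of the error term.

```python
import math


def stub_product(n, m, L, M=0):
    """Product over i = 0, ..., n - 1 of (1 - m / (L - M - 2i - 1))."""
    return math.prod(1 - m / (L - M - 2 * i - 1) for i in range(n))


def exponential_main_term(n, m, L):
    """Main term exp(-n m / L) of the expansion."""
    return math.exp(-n * m / L)


n, m, L, M = 10, 10, 10**6, 0            # hypothetical values with M + n + m << L
gap = abs(stub_product(n, m, L, M) - exponential_main_term(n, m, L))
scale = n * m * (M + n + m) / L**2       # size of the error term in the expansion
```

With these values the gap is of the same order as `scale`, consistent with the claimed $O(nm(M+n+m)/L^2)$ correction in the exponent.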



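Using (4.11), the bounds in (4.10) yield

$$Q_Z^{(1,1)}(H_N > 1) = \exp\Big\{-\frac{Z_1^{(1)}Z_1^{(2)}}{L_N} + O\Big(\frac{Z_1^{(1)}Z_1^{(2)}(Z_1^{(1)} + Z_1^{(2)})}{L_N^2}\Big)\Big\}. \qquad (4.12)$$

Similarly, we can conclude that, as long as $\sum_{k=1}^{\lceil i/2\rceil}(Z_k^{(1)} + Z_k^{(2)}) = o(L_N)$, we have

$$Q_Z^{(\lceil i/2\rceil, \lfloor i/2\rfloor)}(H_N > i-1 \mid H_N > i-2) = \exp\Big\{-\frac{Z^{(1)}_{\lceil i/2\rceil}Z^{(2)}_{\lfloor i/2\rfloor}}{L_N} + O\Big(\frac{Z^{(1)}_{\lceil i/2\rceil}Z^{(2)}_{\lfloor i/2\rfloor}\sum_{k=1}^{\lceil i/2\rceil}(Z_k^{(1)} + Z_k^{(2)})}{L_N^2}\Big)\Big\}.$$

From (4.3) and taking expectations, the main term in (4.1) is evident. For the error term, we obtain
that, as long as $\sum_{k=1}^{\lceil j/2\rceil}(Z_k^{(1)} + Z_k^{(2)}) = o(L_N)$,

$$RM_N(j) = O\Big(\frac{\sum_{i=2}^{j+1} Z^{(1)}_{\lceil i/2\rceil}Z^{(2)}_{\lfloor i/2\rfloor}\sum_{k=1}^{\lceil i/2\rceil}(Z_k^{(1)} + Z_k^{(2)})}{L_N^2}\Big),$$

and we will show at the end of step 2 that for all $j \le (1 + 2\eta)\log_\nu N$, we have $\sum_{k=1}^{\lceil j/2\rceil}(Z_k^{(1)} + Z_k^{(2)}) =
o(L_N)$ and that there exists a $\beta > 0$ such that

$$E[RM_N(j)] = O(N^{-\beta}). \qquad (4.14)$$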



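Step 2: Coupling of SPG to the BP with offspring $\{g_j^{(N)}\}$. We start by showing that for
some $\beta > 0$ and uniformly in $j \le (1 + 2\eta)\log_\nu N$, the main term in (4.1) satisfies

$$E\Big[\exp\Big\{-\frac{\sum_{i=2}^{j+1} Z^{(1)}_{\lceil i/2\rceil}Z^{(2)}_{\lfloor i/2\rfloor}}{L_N}\Big\}\Big] = E\Big[\exp\Big\{-\frac{\sum_{i=2}^{j+1} \hat{Z}^{(1)}_{\lceil i/2\rceil}\hat{Z}^{(2)}_{\lfloor i/2\rfloor}}{L_N}\Big\}\Big] + O(N^{-\beta}). \qquad (4.15)$$

We will deal with the error term (4.14) at the end of this step. Bound

$$\Big|\sum_{i=2}^{j+1}\Big(Z^{(1)}_{\lceil i/2\rceil}Z^{(2)}_{\lfloor i/2\rfloor} - \hat{Z}^{(1)}_{\lceil i/2\rceil}\hat{Z}^{(2)}_{\lfloor i/2\rfloor}\Big)\Big| \le \sum_{i=2}^{j+1}\hat{Z}^{(1)}_{\lceil i/2\rceil}\big|Z^{(2)}_{\lfloor i/2\rfloor} - \hat{Z}^{(2)}_{\lfloor i/2\rfloor}\big| + \sum_{i=2}^{j+1} Z^{(2)}_{\lfloor i/2\rfloor}\big|Z^{(1)}_{\lceil i/2\rceil} - \hat{Z}^{(1)}_{\lceil i/2\rceil}\big|.$$

By Proposition 3.2 and uniformly in $j \le (1 + 2\eta)\log_\nu N$, we have, with probability exceeding
$1 - O(N^{-\beta}\log_\nu N)$, that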



max 



7(1) I 7(2) _ 7(2) I 7(2) I y 

l^^\i/2^\^\i/2\ ^\i/2\V l^^\i/2\\^ 



i+1 



(1) _ 7(1) I 

[i|2\\^\^/2^ ^\il2-\\ 



(2) 

Li/2J • 



i=2 



i=2 



Since $\alpha > \tfrac12 + \eta$, we have $\nu^{(\frac12+\eta)\log_\nu N}N^{-\alpha} = N^{\frac12+\eta-\alpha} = N^{-\alpha_1}$, for some $\alpha_1 > 0$. Hence, for
any $\varepsilon$ with $0 < \varepsilon < \alpha_1$, where as before $P_N$ denotes the conditional probability given the degrees
$D_1, D_2, \ldots, D_N$, and $E_N$ the expectation with respect to $P_N$, we have

< ©(iV-'^ log. N) + (^1 ^[32J ^f?2l > O (^"^"O) 



i+1 



< ©(iV-'^log.iV) + (iV^-i) ^E^[Z«21^ 



(2) 1 
[i/2]^li/2\i 



i=2 



The involved conditional expectation can be computed explicitly and we obtain

$$\sum_{i=2}^{j+1} E_N\big[\hat{Z}^{(1)}_{\lceil i/2\rceil}\hat{Z}^{(2)}_{\lfloor i/2\rfloor}\big] = D_1D_2\sum_{i=2}^{j+1}\nu_N^{\lceil i/2\rceil-1}\nu_N^{\lfloor i/2\rfloor-1} = D_1D_2\sum_{i=0}^{j-1}\nu_N^{i} \le cD_1D_2\nu_N^{j},$$

for some constant $c$. Proposition 3.4 implies that we can bound $\nu_N^j$ by $\nu^j(1 + N^{-\alpha_2})^j$, with probability
exceeding $1 - N^{-\beta_2}$ for some $\alpha_2, \beta_2 > 0$, whereas Lemma 3.1 implies that $L_N^{-1}$ can be replaced by $(\mu N)^{-1}$
with probability exceeding $1 - N^{-\beta_3}$, for some $\beta_3 > 0$. Putting this together we obtain after taking
the expectation with respect to $D_1, D_2, \ldots, D_N$,

^ (^L^\Y.^m\^im - ^\l}2\^\?/2] \ > ^''^ 

< 0{N-^ log. N) + 0{N-P^) + 0{N-P^) + 0{N-P-^) + O {^-^^^^^^^^^^^^) ■ 
Since 1^^ < N^+'^~^ for j < (1 + 2r/) log. N, we obtain 

^P^(^rE472l472j -^S^l^LV^jl >^^-^^ ^ (4-16) 

for some $\beta > 0$ by taking $\beta, \beta_2, \beta_3, \eta$ and $\varepsilon$ sufficiently small. For $x - y$ small, and $x, y > 0$, we find
$e^{-y} = e^{-x} + O(x - y)$, so that

{v^i+i 7(1) 7(2) ^ c v^i+i 7(1) 7(2) N 

ijv J I J 

with probability exceeding $1 - O(N^{-\beta})$. In combination with the inequality $e^{-x} \le 1$ for $x \ge 0$, we
obtain (4.15).

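We turn to the proof of (4.14) and the assumption that $\sum_{k=1}^{\lceil j/2\rceil}(Z_k^{(1)} + Z_k^{(2)}) = o(L_N)$. From
Proposition 3.2 and, uniformly in $j \le (1 + 2\eta)\log_\nu N$, we have with probability exceeding
$1 - O(N^{-\beta}\log_\nu N)$ that

$$\sum_{k=1}^{\lceil j/2\rceil}\big(Z_k^{(1)} + Z_k^{(2)}\big) \le \big(1 + O(N^{\frac12+\eta-\alpha})\big)\sum_{k=1}^{\lceil j/2\rceil}\big(\hat{Z}_k^{(1)} + \hat{Z}_k^{(2)}\big), \qquad (4.17)$$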


SO that, for all i < j, 

/ ^\i/2] f 7(2) 

TO ( 2^k=i y^k + ^, 



— > A^^M < 0{N-^ log. iV) + (1 + 0(iV5+'?-"))E, 



7-3/4 
J-'iw 



Thus, in particular, using 1)4. 17() . Yl^j^li\^k'' + ^k^) — o{Lf^) on the above event. Bounding the 
expectation of Zj^\ we find for < e < 1/4 and for all « < i < (1 + 27]) log. N, 

P f ^fc=i V fc^7 fc ^ > j < Ar-/3 + (1 + o(iV~"i))— 3-^ = 0(iV-'3), 

for some (3 > 0. Hence, for ei > 0, 



(i+l 7(1) 7(2) V^R/2l/7(i) I 7{2)N \ /j+l 7(1) 7(2) 

i=2 y \i=2 ^iV 



By Proposition O the product ^[^'^21 ^Li/2J ^e bounded by (1 + 0(iV3+'?-"))i'(i)2] ^Li/2J 



^[Si=2 ■^ri/21 "^1^/21] — while L^J^ is of order N^^^. Therefore, we obtain from the Markov 



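inequality that

$$P\Big(\frac{1}{L_N^2}\sum_{i=2}^{j+1} Z^{(1)}_{\lceil i/2\rceil}Z^{(2)}_{\lfloor i/2\rfloor}\sum_{k=1}^{\lceil i/2\rceil}\big(Z_k^{(1)} + Z_k^{(2)}\big) > N^{-\varepsilon_1}\Big) = O(N^{-\beta}),$$

for some $\beta > 0$. Since $RM_N(j)$ is the difference of two numbers between 0 and 1 and hence
$|RM_N(j)| \le 1$, we obtain that, when $\varepsilon_1 > \beta$,

$$E|RM_N(j)| \le N^{-\varepsilon_1} + P\Big(\frac{1}{L_N^2}\sum_{i=2}^{j+1} Z^{(1)}_{\lceil i/2\rceil}Z^{(2)}_{\lfloor i/2\rfloor}\sum_{k=1}^{\lceil i/2\rceil}\big(Z_k^{(1)} + Z_k^{(2)}\big) > N^{-\varepsilon_1}\Big) = O(N^{-\beta}). \qquad (4.18)$$

This proves (4.14).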

Step 3: Coupling to the BP with offspring $\{g_j\}$. Corollary 3.5 combined with Lemma 3.1
yields



1=2 i=2 



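From this result we obtain, as in the first half of step 2,

$$E\Big[\exp\Big\{-\frac{\sum_{i=2}^{j+1} \hat{Z}^{(1)}_{\lceil i/2\rceil}\hat{Z}^{(2)}_{\lfloor i/2\rfloor}}{L_N}\Big\}\Big] = E\Big[\exp\Big\{-\frac{\sum_{i=2}^{j+1} \mathcal{Z}^{(1)}_{\lceil i/2\rceil}\mathcal{Z}^{(2)}_{\lfloor i/2\rfloor}}{L_N}\Big\}\Big] + O(N^{-\beta}),$$

where, as before, $\beta$ is a generic small positive number. Using (4.1) and the result of step 2 it follows
that

$$P(H_N > j) = E\Big[\exp\Big\{-\frac{\sum_{i=2}^{j+1} \mathcal{Z}^{(1)}_{\lceil i/2\rceil}\mathcal{Z}^{(2)}_{\lfloor i/2\rfloor}}{L_N}\Big\}\Big] + O(N^{-\beta}).$$

To obtain (4.2), we finally replace, again at the cost of an additional term $O(N^{-\beta})$, the random
number $L_N$ by $\mu N(1 + O(N^{-a}))$.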

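Step 4: Evaluation of the limit points. We start from (4.2) with $j = k + \sigma_N \le (1 + 2\eta)\log_\nu N$,
where $\sigma_N = \lfloor\log_\nu N\rfloor$, to obtain

$$P(H_N > \sigma_N + k) = E\Big[\exp\Big\{-\frac{\sum_{i=2}^{\sigma_N+k+1} \mathcal{Z}^{(1)}_{\lceil i/2\rceil}\mathcal{Z}^{(2)}_{\lfloor i/2\rfloor}}{\mu N}\Big\}\Big] + O(N^{-\beta}). \qquad (4.19)$$

We write $N = \nu^{\log_\nu N} = \nu^{\sigma_N - a_N}$, where we recall that $a_N = \lfloor\log_\nu N\rfloor - \log_\nu N$. Then

$$\frac{\sum_{i=2}^{\sigma_N+k+1} \mathcal{Z}^{(1)}_{\lceil i/2\rceil}\mathcal{Z}^{(2)}_{\lfloor i/2\rfloor}}{\mu N} = \mu\nu^{a_N+k}\,\frac{\sum_{i=2}^{\sigma_N+k+1} \mathcal{Z}^{(1)}_{\lceil i/2\rceil}\mathcal{Z}^{(2)}_{\lfloor i/2\rfloor}}{\mu^2\nu^{\sigma_N+k}}.$$

In the above expression, the factor $\nu^{a_N}$ prevents proper convergence. Without the factor $\mu\nu^{a_N+k}$,
we obtain from (1.7), with probability 1,

$$\lim_{N\to\infty}\frac{\sum_{i=2}^{\sigma_N+k+1} \mathcal{Z}^{(1)}_{\lceil i/2\rceil}\mathcal{Z}^{(2)}_{\lfloor i/2\rfloor}}{\mu^2\nu^{\sigma_N+k}} = \frac{\mathcal{W}^{(1)}\mathcal{W}^{(2)}}{\nu - 1}.$$

Using (2.3) we conclude that for each $\alpha > 0$, there is a $\beta > 0$ such that

$$P\Big(\Big|\frac{\sum_{i=2}^{\sigma_N+k+1} \mathcal{Z}^{(1)}_{\lceil i/2\rceil}\mathcal{Z}^{(2)}_{\lfloor i/2\rfloor}}{\mu^2\nu^{\sigma_N+k}} - \frac{\mathcal{W}^{(1)}\mathcal{W}^{(2)}}{\nu - 1}\Big| > O((\log N)^{-\alpha})\Big) = O(N^{-\beta}).$$

Hence, for $k \le 2\eta\log_\nu N$ and each $\alpha > 0$,

$$P(H_N > \sigma_N + k) = E\big(\exp\{-\kappa\nu^{a_N+k}\mathcal{W}^{(1)}\mathcal{W}^{(2)}\}\big) + O((\log N)^{-\alpha}), \qquad (4.20)$$

where $\kappa = \mu(\nu - 1)^{-1}$. This proves (1.9).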

We proceed by proving (1.4), with $R_a$ given in (1.8). For this, we need to condition on node
1 and node 2 being connected. Node 1 and node 2 are connected if and only if $H_N < \infty$. Using
(4.20), for (1.4), it suffices to prove that

$$P(H_N < \infty) = q^2 + o(1), \quad \text{where } q = P(\mathcal{W}^{(1)} > 0). \qquad (4.21)$$

We prove (4.21) using upper and lower bounds. We note that, with $k = 2\eta\log_\nu N$,

$$P(H_N < \infty) \ge P(H_N \le \sigma_N + k) = E\big(1 - \exp\{-\kappa\nu^{a_N+k}\mathcal{W}^{(1)}\mathcal{W}^{(2)}\}\big) + O((\log N)^{-\alpha}). \qquad (4.22)$$

Therefore,

$$P(H_N < \infty) \ge q^2E\big(1 - \exp\{-\kappa\nu^{a_N+k}\mathcal{W}^{(1)}\mathcal{W}^{(2)}\} \mid \mathcal{W}^{(1)}\mathcal{W}^{(2)} > 0\big) + O((\log N)^{-\alpha}). \qquad (4.23)$$

By dominated convergence, for $k = 2\eta\log_\nu N$, the conditional expectation converges to 1, so that
indeed $P(H_N < \infty) \ge q^2 + o(1)$. For the upper bound, we rewrite, for any $m$,

$$P(H_N < \infty) = P(H_N < \infty, Z_m^{(1)}Z_m^{(2)} = 0) + P(H_N < \infty, Z_m^{(1)}Z_m^{(2)} > 0). \qquad (4.24)$$

The second term is bounded from above by

$$P(H_N < \infty, Z_m^{(1)}Z_m^{(2)} > 0) \le P(Z_m^{(1)}Z_m^{(2)} > 0) = P(\mathcal{Z}_m^{(1)}\mathcal{Z}_m^{(2)} > 0) + o(1) = q_m^2 + o(1), \qquad (4.25)$$

where we use Proposition 3.3 and we write $q_m = P(\mathcal{Z}_m^{(1)} > 0)$. When $m \to \infty$, we have that $q_m \to q$,
so that we are done when we can show that for any $m$ fixed, $P(H_N < \infty, Z_m^{(1)}Z_m^{(2)} = 0) = o(1)$. We
note that if $Z_m^{(1)}Z_m^{(2)} = 0$, then $H_N \le m - 1$. Therefore, using (4.20) with $k = m - \sigma_N - 1$, we
conclude

$$P(H_N < \infty, Z_m^{(1)}Z_m^{(2)} = 0) \le P(H_N \le m-1) = E\big(1 - \exp\{-\kappa\nu^{a_N+k}\mathcal{W}^{(1)}\mathcal{W}^{(2)}\}\big) + o(1) = o(1). \qquad (4.26)$$

This completes the proof of (4.21). We finally complete the proof of Theorems 1.1 and 1.4 using
(4.21), which, together with (4.20), implies that, for $k \le 2\eta\log_\nu N$,

$$P(H_N \le \sigma_N + k \mid H_N < \infty) = E\big(1 - \exp\{-\kappa\nu^{a_N+k}\mathcal{W}^{(1)}\mathcal{W}^{(2)}\} \mid \mathcal{W}^{(1)}\mathcal{W}^{(2)} > 0\big) + o(1). \qquad (4.27)$$

□

5 On the connected components 

In this section, we will investigate the sizes of the connected components and prove Theorem 1.5.

Proof of Theorem 1.5. In the proof, we will make essential use of the results in [31, 32], where
the statement in Theorem 1.5 is proved for certain degree sequences. Indeed, denote by

$$d_i(N) = \sum_{j=1}^{N} I[D_j = i], \qquad i = 0, 1, \ldots, \qquad (5.1)$$

the degree sequence of our random graph $G$, where $D_1, D_2, \ldots, D_N$ is the i.i.d. sequence with
distribution $F$ introduced in (1.1) and satisfying (1.2). In [31], the bounds on the connected components
in Theorem 1.5 are proved with only a lower bound on the largest connected component size, while
in [32], the asymptotic size of the largest connected component is determined. Both papers assume
a number of hypotheses on the degree sequence $\{d_i(N)\}_{i\ge0}$. Thus, Theorem 1.5 follows when we
can show that the probability that our degree sequences in (5.1) satisfy the restrictions is at least
$1 - o(1)$. In fact, we need to alter the random graph $G$ in a certain way to meet the conditions of
Molloy and Reed, and subsequently need to prove that the alteration does not affect the results.
We now go over their conditions and definitions.

Firstly, the degree sequence needs to be feasible, meaning that there exists at least one graph
with the degree sequence. This is true, since $L_N$ is even and we have that

$$\sum_{i=1}^{\infty} i\,d_i(N) = \sum_{i=1}^{\infty}\sum_{j=1}^{N} i\,I[D_j = i] = \sum_{j=1}^{N}\sum_{i=1}^{\infty} i\,I[D_j = i] = \sum_{j=1}^{N} D_j = L_N.$$

Secondly, the degree sequence needs to be smooth, meaning that for some sequence $\lambda_i$, we have

$$\lim_{N\to\infty}\frac{d_i(N)}{N} = \lambda_i.$$

In our setting, this follows almost surely from the law of large numbers, with $\lambda_i = f_i = P(D = i)$.

Thirdly, and this is the most serious condition, the degree sequence needs to be well-behaved,
meaning that it is smooth, feasible, and that for every $\varepsilon'$, there exists $N' = N'(\varepsilon')$, such that for all
$N > N'$, we have that

1. $$\sup_i\Big|i(i-2)\frac{d_i(N)}{N} - i(i-2)\lambda_i\Big| < \varepsilon'; \qquad (5.2)$$

2. there exists $i^*$ such that
$$\Big|\sum_{i=1}^{i^*} i(i-2)\frac{d_i(N)}{N} - \sum_{i=1}^{\infty} i(i-2)\lambda_i\Big| < \varepsilon'; \qquad (5.3)$$

3. there exists an $\varepsilon > 0$ such that $d_i(N) = 0$ for all $i \ge \lceil N^{\frac14-\varepsilon}\rceil$.
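The quantities entering these conditions are simple functions of the degree list. The following sketch is illustrative only, with hypothetical function names and a toy degree list; it computes the degree sequence $d_i(N)$ of (5.1), checks the feasibility identity $\sum_i i\,d_i(N) = L_N$, and evaluates the sum $\sum_i i(i-2)d_i(N)/N$ appearing in (5.3).

```python
from collections import Counter
from fractions import Fraction


def degree_sequence(degrees):
    """Return d_i(N) as a mapping i -> number of nodes with degree i, as in (5.1)."""
    return Counter(degrees)


def total_stubs(d):
    """Sum over i of i * d_i(N), which should equal L_N."""
    return sum(i * c for i, c in d.items())


def molloy_reed_sum(degrees):
    """Sum over i of i (i - 2) d_i(N) / N, the quantity controlled in (5.3)."""
    n = len(degrees)
    return sum(Fraction(i * (i - 2) * c, n) for i, c in degree_sequence(degrees).items())


toy_degrees = [1, 2, 2, 3]               # hypothetical toy degree list
d = degree_sequence(toy_degrees)
```

For the toy list, the total number of stubs is 8, which is even, and the Molloy and Reed sum equals 1/2.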

We start with the last assumption, which is not satisfied by our graph. Indeed, the last restriction
means that all nodes have degree at most $\lceil N^{\frac14-\varepsilon}\rceil - 1$. We will first alter the graph, and thus the
degree sequences, in the following way. Fix $\varepsilon > 0$ small. For nodes $j$ with $D_j \ge \lceil N^{\frac14-\varepsilon}\rceil$, we remove
$D_j - \lceil N^{\frac14-\varepsilon}\rceil + 1$ edges. We do this by first removing in a uniform way edges between pairs $i, j$
where the degrees $D_i$ and $D_j$ both exceed $\lceil N^{\frac14-\varepsilon}\rceil - 1$. When there are no more edges between
nodes with degrees exceeding $\lceil N^{\frac14-\varepsilon}\rceil - 1$, we remove edges uniformly from the nodes with degrees
exceeding $\lceil N^{\frac14-\varepsilon}\rceil - 1$. Thus, we end up with a graph $G'$ such that all degrees are at most $\lceil N^{\frac14-\varepsilon}\rceil - 1$.
Moreover, each node $j$ for which $D_j \ge \lceil N^{\frac14-\varepsilon}\rceil$ has degree equal to $\lceil N^{\frac14-\varepsilon}\rceil - 1$ in the altered graph
$G'$. This will be the graph to which we apply the results of Molloy and Reed. Let $D'_j$ be the degree
of the node $j$ in $G'$, and write $d'_i(N)$ for the number of nodes with degree equal to $i$ in $G'$. Then
$d'_i(N) = 0$ for $i \ge \lceil N^{\frac14-\varepsilon}\rceil$, as required.
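The effect of this truncation on the degree list can be sketched as follows. This is only an illustration on the level of degrees, not of the graph itself: it caps each degree at $\lceil N^{1/4-\varepsilon}\rceil - 1$ and counts the total reduction $\sum_j (D_j + 1 - \lceil N^{1/4-\varepsilon}\rceil)^+$, which is an upper bound on the number of removed edges (an edge between two high-degree nodes lowers both degrees at once). The function names and the example values are hypothetical.

```python
import math


def degree_cap(n, eps):
    """The threshold ceil(N^(1/4 - eps)) above which degrees are cut down."""
    return math.ceil(n ** (0.25 - eps))


def truncate_degrees(degrees, eps):
    """Return the capped degrees D'_j and the total degree reduction."""
    cap = degree_cap(len(degrees), eps)
    truncated = [min(d, cap - 1) for d in degrees]
    reduction = sum(d - t for d, t in zip(degrees, truncated))
    return truncated, reduction


example_degrees = [5] + [1] * 80         # hypothetical list with N = 81 nodes
capped, reduction = truncate_degrees(example_degrees, eps=0.05)
```

Here $N = 81$ and $\varepsilon = 0.05$ give the threshold 3, so the single node of degree 5 is cut down to degree 2.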

We first compute the number of removed edges, which we denote by $R_N$. Its expectation is
bounded above by

$$E[R_N] \le E\Big[\sum_{j=1}^{N}\big(D_j + 1 - \lceil N^{\frac14-\varepsilon}\rceil\big)I\big[D_j \ge \lceil N^{\frac14-\varepsilon}\rceil\big]\Big] \le N\sum_{i\ge\lceil N^{\frac14-\varepsilon}\rceil}P(D_1 \ge i)$$
$$\le cN\sum_{i\ge\lceil N^{\frac14-\varepsilon}\rceil} i^{-\tau+1} = CN^{1-(\tau-2)(\frac14-\varepsilon)} \le N^{\frac34},$$

for $\tau > 3$ and $\varepsilon$ sufficiently small. We are hence removing only a vanishing fraction of the $L_N$ available
edges and all degrees go down (see Lemma 3.1 that $L_N$ is close to $\mu N$). Moreover, with probability
converging to one, we have that $R_N \le 2N^{\frac34}$, since by a computation analogous to the one given
above for $E[R_N]$, we have $\mathrm{Var}(R_N) \le CN^{1-(\tau-3)(\frac14-\varepsilon)}$, so that by the Chebychev inequality,

$$P(R_N > 2N^{\frac34}) \le P(|R_N - E[R_N]| > N^{\frac34}) \le N^{-\frac32}\mathrm{Var}(R_N) \le CN^{-\frac12-(\tau-3)(\frac14-\varepsilon)} \le CN^{-\frac12}. \qquad (5.4)$$

We start by checking (5.2) for the graph $G'$, with $\lambda_i = f_i$ in (1.1). For this, we will use the
following bound from [Hi Corollary 1.4(i)], which states that if $S_N$ is binomial with parameters $N$
and p, and if x = {Np(l — p))^/^ > 1, then 

F{\Sj, - Np\ > x{Np{l - p))V2) < ie-^'/2^ (5.5) 

We first check condition (|5.2j) for i = [A^4~'=] — 1. By construction, we have that for i = [A^^^*^] — 1, 

4{N) = Y,d,{N). (5.6) 

Hence, d^{N) is a binomial random variable with parameters N and p = 1 — F{\N^~''~\ — 2). Thus, 
by (|LS1), with X = C^/logN, we have that 

¥{\4{N)-Np\ > C((logiV)iVp(l-p))^/2) < liV-^'/2. (5.7) 
Thus, we have that for i = \Ni-'] - 1, Xi = fi and p = 1 - F{\Ni-^] - 2), 

< i2^!^/pZ[i - Fi\m-^] - 2)]V2 + i2f^ + _ F{\m-^] - 2)] 

v 

logAf 
N 



for r > 3. This proves ((OI) for i = \N^~^] - 1. 

We next prove (|5.2() for i < \Ni~^~\ — 1. For this, we use the triangle inequality 

<^-2)|^^-A.|<z — +^ |^^-A.|, (5.8) 

and we bound these two terms separately. 

We start with the second term, and use (|5.5j) . which gives that 

F{\di{N) -Nfi\> C if iN log Ny/^) < iV-^'/^ (5.9) 

We will take C > 2, so that 

N 

¥{3i < \N^-'] - 1 : \di{N) - Nfi\ > C{fiN log N)^/^) < ^iV-^'/2 = N^-cV^ ^ (5,10) 

i=l 

On the complementary event, we have that 

I di{N) \ ^ n .2//ifog^^i/2 
sup Ml -2) — i{i-2)Xi\<C sup if — )' =o(l). (5.11) 

1 , iV 1 ^ iV 

Thus, we have bounded the second term in 1)5. 8p . We next turn to the first term in (|5.8() . First, we 
clearly have that \d'^{N) - di{N)\ < Rr^. Thus, since < 2Ni, 



dm di{N) 



N N 



- N ~ - - ' 
for i < N^'\ For i > ivi"', we bound d^(iV) < Yli>idj{N), so that, again using 



^,d[{N) dm < ^ ^ ^^.(^) ^ 2i2(l - F{t - 1))(1 + o(l)) < 2ciV(l-)(3-) ^ 0. 



N 



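To check (5.3), we first take $i^*$ fixed so that

$$\sum_{i=i^*+1}^{\infty} i(i-2)\lambda_i < \varepsilon/2. \qquad (5.12)$$

This is possible, since $\mathbb{E}[D^2] < \infty$. Thus, we are left to show that

$$\sum_{i=1}^{i^*} i(i-2)\Big|\frac{d_i(N)}{N} - \lambda_i\Big| < \varepsilon/2. \qquad (5.13)$$

In order to do so, we use the bound in (5.11) to obtain that

$$\sum_{i=1}^{i^*} i(i-2)\Big|\frac{d_i(N)}{N} - \lambda_i\Big| \le C\sum_{i=1}^{i^*} i^2\Big(\frac{f_i\log N}{N}\Big)^{1/2} \le C(i^*)^3\Big(\frac{\log N}{N}\Big)^{1/2} < \varepsilon/2, \qquad (5.14)$$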

whenever $N$ is sufficiently large. The same result applies to $d'_i(N)$, since $|d'_i(N) - d_i(N)| \le R_N$, and
$R_N = o(N)$, so that



i=l 



Therefore, we have proved all conditions for the graph $G'$, and thus obtain the result in Theorem
1.5 for $G'$. To complete the proof, we need to show that the result for $G'$ implies the result for $G$.

This implication is proved in several small steps. First, denote the largest connected components
of $G$ and $G'$ by $LC_G$ and $LC_{G'}$. Since $G$ can be obtained from $G'$ by adding the removed edges
back, we obtain that (since we put back at most $R_N$ connected components of size at most $\gamma\log N$),

$$|LC_{G'}| \le |LC_G| \le |LC_{G'}| + R_N\cdot\gamma\log N. \qquad (5.15)$$

Thus, since $|LC_{G'}| = qN(1+o(1))$ and $R_N \le 2N^{3/4}$ with probability $1 + o(1)$, we obtain that

$$qN(1+o(1)) \le |LC_G| \le qN(1+o(1)) + O(N^{3/4}\log N) = qN(1+o(1)), \qquad (5.16)$$

so that the largest connected component has size $qN(1+o(1))$ with probability $1+o(1)$, as claimed.

To see that all other connected components in $G$ have size at most $\gamma\log N$, we note that in $G'$ the
removed edges are all connected to nodes with degree $\lceil N^{1/4-\varepsilon}\rceil$. We first show that with overwhelming
probability these nodes are already in the largest connected component in $G'$. Since in $G'$ only the
largest connected component has at least $N^{\delta}$ nodes for any $\delta > 0$ and since $\gamma\log N = o(N^{\delta})$, it
suffices to check that nodes in $G'$ with degree $\lceil N^{1/4-\varepsilon}\rceil$ are connected to at least $N^{\delta}$ other nodes.
Since the probability of picking a node different from the ones already connected to the node under
observation is bounded from below by $1 - N^{2(\frac14-\varepsilon)-1}$ (since all degrees in $G'$ are bounded above by
$\lceil N^{1/4-\varepsilon}\rceil$), the probability that at most $N^{\delta}$ different nodes are chosen is bounded by the probability
that a binomial random variable, with parameters $p = 1 - N^{2(\frac14-\varepsilon)-1}$ and $n = \lceil N^{1/4-\varepsilon}\rceil$, is bounded
from above by $N^{\delta}$. By (5.5), this probability is negligible whenever $\delta < \tfrac14 - \varepsilon$. Thus, we may assume
that all nodes with degree $\lceil N^{1/4-\varepsilon}\rceil$ are in the largest connected component in $G'$. Therefore, we
obtain that the nodes that must be added to $G'$ to form $G$ are attached to the largest connected
component of $G'$. Thus, the size of the second largest connected component of $G$ is bounded from
above by the size of the second largest connected component of $G'$, which is bounded from above
by $\gamma\log N$. □
A Appendix.

A.1 Proof of Proposition 3.4

In this part of the appendix, we prove Proposition 3.4, which we restate here for convenience as
Proposition A.1.1. At the end of this section, we restate and prove Corollary 3.5.

Proposition A.1.1 There exist $\alpha_2, \beta_2 > 0$ such that

$$\mathbb{P}\Big(\sum_{n=0}^{\infty}(n+1)|g_n^{(N)} - g_n| \ge N^{-\alpha_2}\Big) \le N^{-\beta_2}. \qquad (A.1.1)$$

In the proof, we need the following lemma. 

Lemma A.1.2 Fix $\tau > 1$. For each non-negative integer $s$, there exists a constant $C > 0$, such
that

$$\sum_{j=m}^{n}(j+1)^s f_{j+1} \le Cm^{-(\tau-1-s)} + Ch(n), \qquad (A.1.2)$$

where

$$h(n) = \begin{cases} 0, & s < \tau-1, \\ \log(n+1), & s = \tau-1, \\ (n+1)^{s-\tau+1}, & s > \tau-1. \end{cases}$$

We defer the proof of Lemma A.1.2 to the end of this section.
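The shape of the bound (A.1.2) can be checked numerically on a concrete example. The sketch below assumes a pure power-law mass function $f_j \propto j^{-\tau}$ with $\tau = 3.5$, truncated at a hypothetical cutoff `J_MAX`; these choices and all names are illustrative assumptions, not taken from the paper. For this example an integral comparison gives the explicit constant $C = 1/|\tau-1-s|$.

```python
TAU = 3.5          # hypothetical tail exponent, chosen for illustration
J_MAX = 200_000    # hypothetical truncation point of the support

# Normalising constant of the truncated power law f_j = j^(-TAU) / Z.
Z = sum(j ** -TAU for j in range(1, J_MAX + 1))


def f(j: int) -> float:
    """Probability mass at j for the assumed truncated power law."""
    return j ** -TAU / Z if 1 <= j <= J_MAX else 0.0


def weighted_tail_sum(m: int, n: int, s: int) -> float:
    """Left-hand side of (A.1.2): sum_{j=m}^{n} (j+1)^s f_{j+1}."""
    return sum((j + 1) ** s * f(j + 1) for j in range(m, n + 1))


# Case s < tau - 1: the sum decays like m^{-(tau-1-s)}, uniformly in n.
for m in (10, 100, 1000):
    print(m, weighted_tail_sum(m, 20_000, 2), 2.0 * m ** -0.5)

# Case s > tau - 1: the sum grows at most like (n+1)^{s-tau+1}.
for n in (100, 1000, 10_000):
    print(n, weighted_tail_sum(1, n, 3), 2.0 * (n + 1) ** 0.5)
```

In each printed row the second column (the sum) should stay below the third (the bound); the two cases correspond to $h(n) = 0$ and $h(n) = (n+1)^{s-\tau+1}$ in the lemma.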
Proof of Proposition A.1.1. Fix $a, b, \alpha > 0$. Define

$$F = \Big\{\Big|\frac{L_N}{\mu N} - 1\Big| \le N^{-a}\Big\} \cap \Big\{\frac{1}{N}\sum_{i=1}^{N}(D_i+1)^2 I[D_i > N^{\alpha}] \le N^{-b}\Big\} \cap \Big\{\frac{1}{N}\sum_{n=0}^{N^{\alpha}}(n+1)^2\Big|\sum_{i=1}^{N}\big(I[D_i = n+1] - f_{n+1}\big)\Big| \le N^{-b}\Big\}. \qquad (A.1.3)$$

The constants $a$, $b$ and $\alpha$ will be chosen appropriately in the proof. The strategy of the proof is as
follows. We will prove that

$$\mathbb{P}(F^c) \le N^{-\beta_2}, \qquad (A.1.4)$$

for some $\beta_2 > 0$, and that on $F$,

$$\sum_{n=0}^{\infty}(n+1)|g_n^{(N)} - g_n| \le N^{-\alpha_2}, \qquad (A.1.5)$$

for some $\alpha_2$. This proves Proposition A.1.1. We start by showing (A.1.5).
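We bound

$$\sum_{n=0}^{\infty}(n+1)|g_n^{(N)} - g_n| \le \sum_{n=0}^{\infty}(n+1)\Big|g_n^{(N)} - \frac{\mu N}{L_N}g_n\Big| + \sum_{n=0}^{\infty}(n+1)g_n\Big|\frac{\mu N}{L_N} - 1\Big|. \qquad (A.1.6)$$

The second term is bounded by $(\nu+1)N^{-a}$ by the first event in $F$. The first term in (A.1.6) can
be bounded, for $N$ sufficiently large, as, again using the first event in $F$,

$$\sum_{n=0}^{\infty}(n+1)\Big|g_n^{(N)} - \frac{\mu N}{L_N}g_n\Big| = \frac{1}{L_N}\sum_{n=0}^{\infty}(n+1)^2\Big|\sum_{i=1}^{N}\big(I[D_i = n+1] - f_{n+1}\big)\Big| \le \frac{2}{\mu N}\sum_{n=0}^{\infty}(n+1)^2\Big|\sum_{i=1}^{N}\big(I[D_i = n+1] - f_{n+1}\big)\Big|. \qquad (A.1.7)$$

We next split the sum over $n$ into $n > N^{\alpha}$ and $n \le N^{\alpha}$ for some appropriately chosen $\alpha \in (0,1]$. On
$F$, the contribution from $n \le N^{\alpha}$ is at most $\frac{2}{\mu}N^{-b}$, whereas we can bound the contribution from
$n > N^{\alpha}$ by

$$\frac{2}{\mu N}\sum_{n > N^{\alpha}}\sum_{i=1}^{N}(n+1)^2\big(I[D_i = n+1] + f_{n+1}\big) \le \frac{2}{\mu N}\sum_{i=1}^{N}(D_i+1)^2 I[D_i > N^{\alpha}] + \frac{2}{\mu}\sum_{n > N^{\alpha}}(n+1)^2 f_{n+1}.$$

For $\tau > 3$, the second term is bounded by $CN^{-\alpha(\tau-3)}$ by Lemma A.1.2. The first term is bounded
by $\frac{2}{\mu}N^{-b}$ by the second event in $F$. Thus, we obtain (A.1.5) with $\alpha_2 = \min\{b, \alpha(\tau-3)\}$.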

We now prove (A.1.4). For this, we use that $F$ is an intersection of three events which we will
write as $F_1$, $F_2$ and $F_3$, so that

$$\mathbb{P}(F^c) \le \mathbb{P}(F_1^c) + \mathbb{P}(F_2^c) + \mathbb{P}(F_3^c). \qquad (A.1.8)$$

The first probability is bounded by $\mathbb{P}(F_1^c) \le cN^{2a-1}$, by Lemma 3.1. For $\mathbb{P}(F_2^c)$, we use the Markov
inequality, to obtain that

$$\mathbb{P}(F_2^c) \le N^{b}\,\mathbb{E}\big[(D_1+1)^2 I[D_1 > N^{\alpha}]\big] \le CN^{b-\alpha(\tau-3)}, \qquad (A.1.9)$$

by Lemma A.1.2. For $\mathbb{P}(F_3^c)$, we use in turn the Markov inequality, Cauchy-Schwarz in the form

^n=o ^ri < (l2n=o ^n=o ^n)^) ^^'^ the Jensen inequality applied to x 1-^ ^/x (a concave function), 
to obtain 

AT" AT 



P(F3^) < N'-'E[Y,{n + l?\Y.{l[D,=n + l]-fn+i)\] (A.LIO) 

n=0 i=l 

TV" N 

< N'~HN- + 1)^e( j;(n + 1)^( 5; (/[A = n + 1] - Z^+i))')' ' 

ra=0 1=1 

N"- N 

< IN"^^/'-' ^(n + 1)^E( Y: m = n+l]- ) 

n=0 i=l 



n=0 
where in the last inequalities, we have used Lemma A.1.2 and

$$\mathbb{E}\Big[\Big(\sum_{i=1}^{N}\big[I[D_i = n+1] - f_{n+1}\big]\Big)^2\Big] = \mathrm{Var}\Big(\sum_{i=1}^{N} I[D_i = n+1]\Big) = Nf_{n+1}(1-f_{n+1}) \le Nf_{n+1}.$$

Thus, we obtain the statement in Proposition A.1.1 with

$$\beta_2 = \min\{1/2 - b - \alpha\max\{1, 6-\tau\}/2,\ \alpha(\tau-3) - b,\ 1 - 2a\}.$$

By picking first b small, and then a small, we see that $\alpha_2, \beta_2 > 0$. □

Remark A.1.3 When (1.2) holds for some $\tau > 2$ (rather than $\tau > 3$), then the above proof can be
repeated to show that

$$\mathbb{P}\Big(\sum_{n=0}^{\infty}|g_n^{(N)} - g_n| \ge N^{-\alpha_2}\Big) \le N^{-\beta_2}. \qquad (A.1.11)$$

Indeed, in the definition of the event $F$ in (A.1.3), we can replace $(D_i+1)^2$ by $(D_i+1)$ in the second
event, and $(n+1)^2$ by $(n+1)$ in the third event. Then, by adapting the above argument, the event
$F$ implies that $\sum_{n=0}^{\infty}|g_n^{(N)} - g_n| \le N^{-\alpha_2}$. The proof that $\mathbb{P}(F^c) \le N^{-\beta_2}$ can be adapted accordingly.



Proof of Lemma A.1.2. Define a density $f(x) = \sum_{j=0}^{\infty} f_j I[j \le x < j+1]$, and the corresponding
distribution function $\tilde F(x) = \int_0^x f(u)\,du$. Then for integer-valued $j \ge 0$ and $x \in [j, j+1]$,

$$\tilde F(j) = f_0 + \cdots + f_{j-1} = F(j-1), \qquad F(j-1) \le \tilde F(x) \le F(j).$$

Moreover

$$\sum_{j=m}^{n}(j+1)^s f_{j+1} \le \int_{m+1}^{n+2} x^s f(x)\,dx = -\int_{m+1}^{n+2} x^s\,d(1-\tilde F(x)).$$

Using partial integration and the upper bound

$$1 - \tilde F(x) \le 1 - F(j-1) \le c(j-1)^{-\tau+1}$$

for $x \in (j, j+1)$, we conclude that

$$\sum_{j=m}^{n}(j+1)^s f_{j+1} \le (m+1)^s(1-\tilde F(m+1)) - (n+2)^s(1-\tilde F(n+2)) + \int_{m+1}^{n+2}(1-\tilde F(x))\,dx^s \le C\Big[m^{s+1-\tau} + \int_{m}^{n+1} y^{s-\tau}\,dy\Big].$$

This yields the upper bound. □

We finally prove Corollary 3.5. In order to do so, we first formulate and prove an intermediate
result. This result will be followed by the reformulation of Corollary 3.5, which now becomes
Corollary A.1.5, and its proof.

Proposition A.1.4 There exist $\varepsilon, \beta, \eta > 0$ such that for all $j \le (\tfrac12+\eta)\log_\nu N$, as $N \to \infty$,

'^1 E^i^' - E > N-^) = 0{N-^). (A.1.12) 

'^^^ i=i i=i 
Proof. Let

$$F_N = \Big\{\sum_{n=0}^{\infty} n|g_n^{(N)} - g_n| \le N^{-\alpha_2}\Big\}, \qquad (A.1.13)$$

then according to Proposition A.1.1 we have $\mathbb{P}(F_N^c) \le N^{-\beta_2}$. We claim that for all $i \ge 1$,

i 

E^lZf^ - Zf'l < max{u - a^,Uj, - a^} E^[i^^](max{zy, zv^})*"'", (A.1.14) 

m=l 

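where

$$\alpha_N = \sum_{n=0}^{\infty} n\min\{g_n, g_n^{(N)}\} = \nu - \sum_{n=0}^{\infty} n\big(g_n - \min\{g_n, g_n^{(N)}\}\big) = \nu_N - \sum_{n=0}^{\infty} n\big(g_n^{(N)} - \min\{g_n, g_n^{(N)}\}\big). \qquad (A.1.15)$$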

We first prove (IA.1.141) . For / Zf\ the couphng is not successful in at least one of the 
generations m,l < m < i. Let m be the first generation for which the coupling is unsuccessful. 
There are at most Zm nodes for which the coupling can fail. If the coupling fails for a node, the 
expected difference between the offspring of that node is bounded above by max{i/ — a^-, i^n — On}- 
Finally, from generation m + 1 on, we again have two BP's with laws g and g'-^\ so that the expected 
offspring is bounded by (max{z/, i/jv})*"™"- This demonstrates the claim ()A.1.14p . 
Furthermore, since Eiv[^m^] = Diu"^'^, we end up with 

Eivl^i^' — Z-^^\ < max{i/ — Oiv, i^iv — ajv}^^i(inax{z/, z^jv})*~^- (A. 1.16) 

By ()A. 1.13(1 . on Fj^ we have that 

max{zy - aN,i^N - a^} < '^n\gn - 5n^'| < N~ 

n 

"^"^"'""^ =1 + ^-' max{0, Y: n{gn - 9^} = 1 + 0{N- 



-«2 



-02) 



Hence, for j < + rj) log^, N, using the abbreviation 
we have 

< N-^^+E[N'E^[T^If^]]. 
From ()A.1.16|) and the estimates on Fj^, we obtain 



1 

N 



i=l 

j 



i=l 

L(|+^)log. N\ 

< ^i7V=+''-"2 J2 i(l + 0(7V-"2))»-i 

1=1 

< ^^N'+^-^^ ■ {log.Nf • iv(^+^)iog.(i+0(iV-2))^ 
using that for $x = 1 + O(N^{-\alpha_2}) > 1$, we have $\sum_{i=1}^{j} i x^{i-1} \le j^2 x^{j}$. This proves the proposition since
$\log_\nu^2 N\cdot N^{(\frac12+\eta)\log_\nu(1+O(N^{-\alpha_2}))}$ is bounded by any small power of $N$, and $\varepsilon$ and $\eta$ can both be
taken arbitrarily small, whereas $\alpha_2 > 0$.

We finally restate and prove Corollary 3.5. □

Corollary A. 1.5 There exist e, /3, > such that for all j < (1 + 2r]) log^, N , as N ^ 00, 



1=1 



) 57(2) _ 7(2) 

Z^^li/2]^li/2\ 
1=1 



> N- 



0{N- 



(A.1.17) 



Proof. Bound 



Y^i 7(1) 7(2) 

l^i=l^li/2]^li/2\ 



7(1) 7(2) 
l^i=l^li/2]^li/2\ 



N 



< 



2^i=l ^[i/2j 



7(1) ^ 



+ 



Y^J 7(1) ^57(2) _ 7(2) \ 
2^i=l^\i/2']^^[i/2i 



NVN 



(A.1.18) 



Both terms on the right hand side of (|A.1.19|) can be treated as in the proof of Proposition IA.1.41 
because the processes with sources (1) and (2) are independent and uniformly in i < + r/) log^ A^, 



max 



(1)1 



A^ 



max{Ar'',Ar'' • (1 + 0(A^-°2)) 



on Fpf. The right-hand side can again be bounded by any small power of A'^ by taking rj arbitrarily 
small. We omit further details. □ 



A.2 Proof of Proposition 3.2

In this second part of the appendix, we restate our main result on the coupling between the SPG
and the BP with offspring distribution $\{g_n^{(N)}\}$ once more and give a full proof.

Proposition A.2.1 There exist $\eta, \beta > 0$, $\alpha > \tfrac12+\eta$ and a constant $C$, such that for all
$j \le (\tfrac12+\eta)\log_\nu N$,

$$\mathbb{P}\big((1 - N^{-\alpha}\nu^j)\hat Z_j^{(1)} \le Z_j^{(1)} \le (1 + N^{-\alpha}\nu^j)\hat Z_j^{(1)}\big) \ge 1 - CjN^{-\beta}. \qquad (A.2.1)$$



This proof is divided into several lemmas. It is rather involved, and we may think of Proposition
A.2.1 as one of the key estimates of the paper. We start with an explanation of the different steps
in this proof.

The proof of Proposition A.2.1 proceeds by induction with respect to $j$. Note that for all
$j \le (\tfrac12+\eta)\log_\nu N$, we have $N^{-\alpha}\nu^j \le N^{(\frac12+\eta)-\alpha} \to 0$, as $N \to \infty$ and when $\alpha > \tfrac12+\eta$. When at level
$j-1$, the event in the statement of the proposition holds, we have

$$|Z_{j-1}^{(1)} - \hat Z_{j-1}^{(1)}| \le N^{-\alpha}\nu^{j-1}\hat Z_{j-1}^{(1)},$$

so that we control the difference between the number of stubs $Z_{j-1}^{(1)}$ and the number of children
$\hat Z_{j-1}^{(1)}$. The absolute value of this difference is bounded by $\hat Z_{j-1}^{(1)}$ times a fraction that converges to
0. For generation $j$ we have to control the difference $Z_j^{(1)} - \hat Z_j^{(1)}$. Differences in generation $j$ arise
from differences in generation $j-1$ and from drawing stubs with label 2 or label 3. If a label 2 stub
is chosen, then the SPG will contain a loop or cycle and hence no free stubs in level $j$ are created,
whereas in the BP a non-negative number of offspring is attached. If a label 3 stub is chosen, then
the corresponding node with described number of children is attached in the BP, whereas for the
SPG we have to resample until we draw a stub labeled 1 or 2. Hence, if $Z_j^{(1)} > \hat Z_j^{(1)}$, so that the
number of free stubs attached to nodes at distance $j-1$ of the SPG exceeds the number of children
in generation $j$ of the BP, then this overshoot can only be caused by drawing label 3 stubs. The
number of stubs with label 3 is bounded by the total number drawn in the SPG, i.e., by

i=l i=l i=l 

For $Z_j^{(1)} < \hat Z_j^{(1)}$, the numbers of stubs with label 2 and with label 3 both matter, and their total is bounded
by

i=l i=l i=l i=l 

In both cases the probability of drawing a label 2 or 3 stub is bounded by 



, < , (A.2.2) 

on the event where Yli=i < N^^^ . Using that Ljv is of order E[Liv] = /xA^ (see Lemma 
this probability is sufficiently small to allow us to use Chebychev's inequality. 

The main lemmas in this section are Lemma A.2.7 and Lemma A.2.9. Together, they prove the
induction step described above. Lemmas A.2.2 up to A.2.6 are preparations, the most important
one being Lemma A.2.6. This lemma shows that if the total progeny up to and including generation
$j$ of $\{\hat Z_i^{(1)}\}$ is larger than $N^{\frac12-\delta}$, for some $\delta > 0$, then with overwhelming probability also each of
the sizes of the last two generations, i.e., $\hat Z_{j-1}^{(1)}$ and $\hat Z_j^{(1)}$, exceed $N^{\frac12-2\delta}$.

As before, we will abbreviate the conditional probability and expectation given $D_1, \ldots, D_N$ by
$\mathbb{P}_N$ and $\mathbb{E}_N$.

Lemma A.2.2 For $0 < \eta < \tfrac12$ and all $j \ge 1$,

$$\mathbb{P}_N\Big(Z_j^{(1)} \ne \hat Z_j^{(1)},\ \sum_{i=1}^{j}\hat Z_i^{(1)} \le N^{\frac12-\eta}\Big) \le \frac{N^{1-2\eta}}{L_N}. \qquad (A.2.3)$$

Lemma A.2.2 together with Lemma 3.1 prove Proposition A.2.1 for all $j$ such that the total size of
the BP is at most $N^{\frac12-\eta}$.

Proof. We denote by $l$ the first stub which is grown differently in the SPG and in the BP. Assume
that this $l$th stub is in the $j$th generation or earlier.

Before the growth of the $l$th stub, the BP and the SPG are identical. Thus, we must have that
$l \le \sum_{i=1}^{j}\hat Z_i^{(1)}$. Hence, as we reach the $l$th stub, the number of stubs having either label 2 or 3 is
bounded above by $\sum_{i=1}^{j}\hat Z_i^{(1)} \le N^{\frac12-\eta}$. A difference in the SPG and the BP can only arise when we
draw a stub for the BP having label 2 or 3. Thus, the probability that the $l$th stub is the first to
create a difference between the SPG and the BP is bounded above by $N^{\frac12-\eta}/L_N$. Therefore,



'(^^'/^^E^r'^A^n< E 



□ 

Recall that $\nu_N = \sum_{n=0}^{\infty} n g_n^{(N)}$ is the expected offspring of the BP $\{\hat Z_j^{(1)}\}_j$ under $\mathbb{P}_N$. Note from
Proposition A.1.1 that $\nu_N$ is close to $\nu$ with probability close to one. In the statement of the next
lemma, we write

$$D^{(N)} = \max_{1 \le i \le N} D_i. \qquad (A.2.4)$$
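Lemma A.2.3 For every $\gamma > 0$,

$$\mathbb{P}(D^{(N)} > N^{\gamma}) \le cN^{1-(\tau-1)\gamma}. \qquad (A.2.5)$$

Proof. We use Boole's inequality to obtain from (1.2) that

$$\mathbb{P}(D^{(N)} > N^{\gamma}) \le \sum_{i=1}^{N}\mathbb{P}(D_i > N^{\gamma}) = N\,\mathbb{P}(D_1 > N^{\gamma}) \le cN^{1-(\tau-1)\gamma}. \qquad (A.2.6)$$
□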
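The union bound in Lemma A.2.3 can be compared with the exact probability in a simple special case. The sketch below assumes a pure Pareto-type tail $\mathbb{P}(D > x) = x^{-(\tau-1)}$ for $x \ge 1$, which is a hypothetical choice satisfying (1.2) with $c = 1$; for i.i.d. degrees the exact value is then $1 - (1 - N^{-\gamma(\tau-1)})^N$. Function names and parameter values are illustrative only.

```python
import math


def prob_max_exceeds(n: int, gamma: float, tau: float) -> float:
    """Exact P(max_i D_i > n^gamma) for n i.i.d. degrees with the assumed
    tail P(D > x) = x^{-(tau-1)}, evaluated stably via log1p/expm1."""
    tail = n ** (-gamma * (tau - 1.0))          # P(D > n^gamma)
    return -math.expm1(n * math.log1p(-tail))   # 1 - (1 - tail)^n


def union_bound(n: int, gamma: float, tau: float) -> float:
    """Boole's inequality: n * P(D > n^gamma) = n^{1 - (tau-1)*gamma}."""
    return n ** (1.0 - gamma * (tau - 1.0))


# Hypothetical parameters: tau = 3.5 and gamma = 0.6 > 1/(tau-1) = 0.4.
for n in (10**3, 10**4, 10**5, 10**6):
    print(n, prob_max_exceeds(n, 0.6, 3.5), union_bound(n, 0.6, 3.5))
```

Both columns decay polynomially in $N$ when $\gamma > 1/(\tau-1)$, which is the regime in which the lemma is used.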



Lemma A. 2. 4 For r],5 € l)? '^^^ i ^ (l + ^) logj, N , there exists /32 > suc/i that 



(A.2.7) 



i=l 



Proof. By Proposition 13.41 we can include the indicator that |z^jv — < A this explains the 
additional error term N~^^. By the Markov inequality, we obtain for j < + rj) log^ A, 

j j 



The expectation on the right-hand side can be computed by conditioning: 

E[Zf'/[|i.^ - z^l < A-"^]] = E[E^[Zf'/[|i.^ - z^l < A-2]] 



Hence, 



1=1 



+ A^^^y - 1 

+ A-"2) - 1 



< A~^2 +C7A''-''. 



□ 



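In the lemma below, we write $d$ for a random variable with discrete distribution $\{g_n^{(N)}\}$ given in
(|3.H), and $\mathrm{Var}_N(d)$ for the variance of $d$ under $\mathbb{P}_N$. Furthermore, we let, for any $0 < a < \tfrac12$,

$$A_N = A_N(a, \gamma, \alpha_2) = \Big\{\Big|\frac{L_N}{\mu N} - 1\Big| \le N^{-a}\Big\} \cap \{D^{(N)} \le N^{\gamma}\} \cap \{|\nu_N - \nu| \le N^{-\alpha_2}\}, \qquad (A.2.8)$$

then, according to Proposition 3.4, Lemmas 3.1 and A.2.3, we have

$$\mathbb{P}(A_N^c) = O(N^{-\varepsilon}),$$

where $\varepsilon = b \wedge ((\tau-1)\gamma - 1) \wedge \beta_2 > 0$ whenever $\gamma > 1/(\tau-1)$. On $A_N$, we have

$$\frac{1}{\mu N(1+N^{-a})} \le \frac{1}{L_N} \le \frac{1}{\mu N(1-N^{-a})}. \qquad (A.2.9)$$

This will be used in the following lemma.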
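Lemma A.2.5 For every $\gamma > 0$,

$$\mathbb{E}(\mathrm{Var}_N(d) I[A_N]) \le CN^{(4-\tau)^{+}\gamma},$$

where $x^{+} = \max(0, x)$.

Proof. Since the variance of a random variable is bounded by its second moment,

$$\mathrm{Var}_N(d) \le \sum_{n=0}^{\infty} n^2 g_n^{(N)} = \sum_{n=0}^{\infty}\sum_{j=1}^{N} n^2\frac{D_j}{L_N} I[D_j = n+1] \le \frac{1}{L_N}\sum_{j=1}^{N} D_j^3, \qquad (A.2.10)$$

and so, for $\tau \in (3, 4]$,

$$\mathbb{E}(\mathrm{Var}_N(d) I[A_N]) \le \sum_{i=1}^{N}\mathbb{E}\Big[\frac{1}{L_N}D_i^3 I[A_N]\Big] \le C\,\mathbb{E}\big[D^3 I[D \le N^{\gamma}]\big] \le C\sum_{i \le N^{\gamma}} i^3 f_i \le CN^{(4-\tau)\gamma},$$

by Lemma A.1.2. For $\tau > 4$, the third moment of $D$ is finite, and the result is also true even without
the indicator $I[D^{(N)} \le N^{\gamma}]$. □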

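Lemma A.2.6 For all $(\tfrac12 - 2\eta)\log_\nu N \le j \le (\tfrac12 + 2\eta)\log_\nu N$, there exist $\delta, \beta > 0$ such that

$$\mathbb{P}\Big(\sum_{i=1}^{j}\hat Z_i^{(1)} \ge N^{\frac12-\delta},\ \hat Z_{j-1}^{(1)} \le N^{\frac12-2\delta}\Big) \le CN^{-\beta}, \qquad (A.2.11)$$

$$\mathbb{P}\Big(\sum_{i=1}^{j}\hat Z_i^{(1)} \ge N^{\frac12-\delta},\ \hat Z_{j}^{(1)} \le N^{\frac12-2\delta}\Big) \le CN^{-\beta}. \qquad (A.2.12)$$

Remark: The statements of the lemma are almost identical, the difference being that the index $j-1$ of
$\hat Z_{j-1}^{(1)}$ in the first statement is replaced by the index $j$ in the second statement. We will be satisfied
with a proof for the first statement only; the proof with index $j$ is a straightforward extension.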
Proof. Since Y2i=i — , there must be an i < j < + 2r]) log^, N such that for N large 

enough 



(i + 27?) log, A 



1 3 i 

> N2~2. 



We write I for the first i < j such that Zl > N'2 2 . It suffices to bound 

j j 



(A.2.13) 



i=l k=l 



The contribution from I = j - 1 is 0. When / = j, then Zj'^ > At"^"^, but Z"^'^^ < At"^" so that 
from the Markov inequality 



■h-25 



k=l 

< E 



(A.2.14) 



< n'^+¥e 



N~2 + 2^E 



< CA"2 + 2^A2- 
Thus, we are left to deal with the cases where $l < j-1$. Then, there exists an $i < j-1$ such
that ' > iV5-|'5, but Z^'\ < iV^-2<5. Thus, there must be a first s > i such that < Zi'\ 

Consequently, Zs^^ > Z!^' > A^2~2^. We will bound, uniformly in s, 



(A.2.15) 



for some $\beta > 0$. This proves (A.2.11), since the total number of possible $i$ and $s$ with $i < s \le j$ is
bounded by $(\log_\nu N)^2$.

We use Lemma A.2.3 to see that we may include the indicator on $A_N$ for any $\gamma > 1/(\tau-1)$. We
will use the Chebychev inequality and Lemma A.2.5 to obtain that



{zil,<zi^\z^/>>m-l\A^) 



(A.2.16) 



= E 

< E 

< E 



< C7A(4— )+7-|+f<5 < N-P^ 



with C = 2{v- 1)-^ and since (4 — r)+7 < 1/2 and 5 > can be taken arbitrarily small. 

□

We are now ready to give the proof of Proposition A.2.1.

Proof of Proposition A.2.1. We first set the stage for the proof by induction in $j$. Fix $\eta < \delta < 2\eta$, and $\alpha > \tfrac12+\eta$, and define

$$E_j = \big\{\forall i \le j : (1 - N^{-\alpha}\nu^i)\hat Z_i^{(1)} \le Z_i^{(1)} \le (1 + N^{-\alpha}\nu^i)\hat Z_i^{(1)}\big\}. \qquad (A.2.17)$$

We will prove by induction that for all $j \le (\tfrac12+\eta)\log_\nu N$,

$$\mathbb{P}(E_j^c) \le CjN^{-\beta}, \qquad (A.2.18)$$

which implies Proposition A.2.1 by taking the complementary event. First, by Lemma A.2.2 and
A.2.4 and since $\eta < \delta$ we see that it is sufficient to prove for $j \le (\tfrac12+\eta)\log_\nu N$,

j 

1=1 

For j < — 2r]) log^ A, we bound 

a^^-^ < ^ < < p( ^ zf ^ > A^-^) < N-^ + CA-2^+^ 



i=l 



i=l 



by the Markov inequality and using Proposition 3.4 in a similar way as in Lemma A.2.4. Hence,
the statement in (A.2.18) follows for $j \le (\tfrac12 - 2\eta)\log_\nu N$. This initializes the induction in $j$.
To advance the induction, we bound

j j 

i=l 

j 



i=l 



i=l 
where the last inequality follows by the induction hypothesis. Thus, it suffices to prove that

$$\mathbb{P}(E_j^c \cap E'_{j-1}) \le CN^{-\beta}, \qquad (A.2.19)$$

where 

3 

Note that

$$E_j^c \cap E'_{j-1} = \big(\{Z_j^{(1)} < (1 - N^{-\alpha}\nu^j)\hat Z_j^{(1)}\} \cap E'_{j-1}\big) \cup \big(\{Z_j^{(1)} > (1 + N^{-\alpha}\nu^j)\hat Z_j^{(1)}\} \cap E'_{j-1}\big). \qquad (A.2.20)$$

We write the disjoint events on the right-hand side of (A.2.20) as $E_j^{<}$ and $E_j^{>}$ and bound the
probability of these events separately. We will start with $E_j^{<}$. This result is stated in the following
lemma:

Lemma A.2.7 There exists $\beta > 0$ such that for all $(\tfrac12 - 2\eta)\log_\nu N \le j \le (\tfrac12 + \eta)\log_\nu N$,

$$\mathbb{P}(E_j^{<}) \le CN^{-\beta}. \qquad (A.2.21)$$

Proof. We note that on Ej ^, we have that 

j j i 

< 1^(1 + i^*iV-")Zf ' < (1 + A^+^A-") Y ' < 2A^+^ 

i=l i=l 1=1 

because $\alpha > \tfrac12+\eta$. Thus, for every stub which is grown simultaneously for the BP and the SPG,
there is a probability bounded from above by $2N^{\frac12+\delta}/L_N$ that a difference is created between the
BP and the SPG (such a difference is called a miscoupling). Denote by $U$ the number of stubs
where such a difference occurs. Then, $U$ is bounded from above by a binomial random variable with
$n = N^{\frac12+\delta}$ and $p = 2N^{\frac12+\delta}/L_N$. Thus, by the Markov inequality, we have,

$$\mathbb{P}_N(U \ge N^{a}) \le \frac{2N^{1+2\delta-a}}{L_N}.$$

Using (A.2.9), we obtain, for $2\delta < a$,

$$\mathbb{P}(U \ge N^{a}) \le CN^{-(a-2\delta)} + N^{-\varepsilon} \le N^{-\beta}. \qquad (A.2.22)$$

Observe that differences between $Z_j^{(1)}$ and $\hat Z_j^{(1)}$ can only arise through (i) different numbers of
stubs in the $(j-1)$st generation, and (ii) differences created in the $j$th generation which we previously
called miscouplings. In the first case, the difference in the number of stubs is bounded from below
by an independent draw from $g^{(N)}$. A miscoupling occurs if we draw a stub with label 2 or 3. Hence,

(yW 7(1) \ + 

^f-^f>- Yl d,-Ydi, (A.2.23) 

i=l 1=1 

where {dj}j>i are independent draws from (7*^' and {di}i>i are draws conditionally on drawing a 
stub labeled 2 or 3. On we have that 

{Z^'\ - Z^'\)^ < N-'^u^-^Z'^^, (A.2.24) 

so that on -E'j<, introducing the notation a^vj = N~^v^~^ Z''^\, 

(yW 7(1) "1 + 

Ydi + Y^i^ Yl d^ + Ydi> N-^'u^Zfl (A.2.25) 

1=1 i=l i=l i=l 






Combining this with (A.2.22) and using the definition of $\alpha_{N,j}$, we see that in order to prove
(A.2.21) it suffices to show that

P({ ^ + ^ > iV-"zy'Z]'^} n < CN'^. (A.2.26) 



i=l i=l 



We will first show that on Ej_^ the term X]i=i small compared to N~°'u^ Zj , if we choose 

a sufficiently small. On Ej_i, we have ^^=1 Z^^'' > N^^^ ^ and so, with probability larger than 
1 — CN~^ , according to Lemma lA.2. 61 we have that also Zj^' > N^^'^^. Hence, 

[AT"] [N°- 



2 25 



i=l i=l 



where 7 = 1 — 2r/ — a — 25 — a< ^, but can be taken arbitrary close to ^. Since r > 3, we then 
have that cN^~(^-'^h < N'/^. 

Hence it suffices to prove the statement in ()A.2.26p without the term di, that is, it suffices 
to prove 

^di> -N~'^v^Zf\E'^A < CN~^. (A.2.27) 

■ i=l 
^(1) 

Since we can write Zj^' = X^jii^^i' and, using again Lemma lA.2.61 we have that E'-_^ implies 
^f-i — -/V^~2'5^ with probability larger than 1 — CN~^, it is sufficient to prove that 



(1 - N-"u^) Y,di> -N-'^u^ d,, Zf_-^ > N^-^^) < N-^. (A.2.28) 

4 = 1 i=OlN 7+1 



^(1) 

Now Ejv[(i] = z/jv and, given Z^^li, the variance of YliLi^i'^i ~ ^n) equals Z^^^^aXf^{d). Therefore, 
by the Chebychev inequality. 



7(1) 



i=l i=aN 1+1 



7(1) 



^ 4Z]'iiVar^(di) _ 4Varjv((ii) 



7(1) 



(i/^a^,j(z/ - 1))2 zf\N~^'^iy^i {1 - v-^^vl 
We use Lemma A.2.5. Hence, by intersecting with the event $A_N$ and its complement, and using
(A.2.8), we obtain for $j \ge (\tfrac12 - 2\eta)\log_\nu N$,



7(1) 

aN,j 

2 2<5 

i=l i=apfj+l 



(1 - N-'^u^) ^di> N-'^u^ Yl ^ 



<CiN-' + C2- 1 

by fixing a > ^ + so that the exponent is negative (using that 7 < ^ and (4 — r)+ < 1), and 
writing /3 = ^ — 2a — 26 — A-q — [4 — r)"'"7 > 0. This proves HA.2.28|) and completes the proof of 
Lemma rOTl □ 

Before turning to the proof of the bound on $\mathbb{P}(E_j^{>})$ in Lemma A.2.9 below, we start with a
preparatory lemma and some definitions. Suppose we have $L$ objects divided into $N$ groups of sizes
$d_1, \ldots, d_N$, so that $L = \sum_{i=1}^{N} d_i$. Suppose we draw an object at random, and we define a random
variable by $d_I - 1$ when the object is taken from the $I$th group. This gives a distribution $g^{(d)}$, i.e.,

$$g_n^{(d)} = \frac{1}{L}\sum_{i=1}^{N} d_i I[d_i = n+1]. \qquad (A.2.29)$$

Clearly, $g^{(N)} = g^{(D)}$, where $D = (D_1, \ldots, D_N)$.
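The size-biased law in (A.2.29), and the effect of removing labelled objects, can be written out directly. The following is a minimal sketch using exact rational arithmetic; the function names `size_biased_offspring`, `mean_offspring` and `remove_labelled`, and the example group sizes, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from fractions import Fraction


def size_biased_offspring(degrees):
    """Return {n: g_n} with g_n = (1/L) * sum_i d_i * 1[d_i = n + 1],
    i.e. the law of d_I - 1 when an object is drawn uniformly at random."""
    total = sum(degrees)
    counts = Counter(d for d in degrees if d > 0)
    return {d - 1: Fraction(d * c, total) for d, c in sorted(counts.items())}


def mean_offspring(g):
    """Mean of the law g, i.e. sum_n n * g_n."""
    return sum(n * p for n, p in g.items())


def remove_labelled(degrees, labelled):
    """Remove labelled[i] objects from group i and drop emptied groups."""
    if any(k < 0 or k > d for d, k in zip(degrees, labelled)):
        raise ValueError("cannot label more objects than a group contains")
    return [d - k for d, k in zip(degrees, labelled) if d - k > 0]


d = [1, 2, 3]                                   # hypothetical group sizes, L = 6
g = size_biased_offspring(d)
print(g, mean_offspring(g))                     # law g^(d) and its mean
d_prime = remove_labelled(d, [0, 0, 1])         # label one object in the largest group
print(size_biased_offspring(d_prime))           # law after removing it
```

Labelling objects and then recomputing the law on the remaining $S = L - M$ objects is exactly the operation analysed in the remainder of this section; which objects are labelled changes the resulting law.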

We next label $M$ of the $L$ objects, and suppose that the distribution $g^{(d)}(M)$ is obtained in
a similar way from drawing conditionally on drawing an unlabelled object. More precisely, we
remove the labelled objects from all objects thus creating new $d'_1, \ldots, d'_N$, $S = L - M$, and we let
$g^{(d)}(M) = g^{(d')}$. Even though this is not indicated, the law $g^{(d)}(M)$ depends on what objects have
been labelled.

Lemma A.2.8 below shows that the law $g^{(d)}(M)$ can be bounded above and below by two specific
ways of labelling the $M$ objects. Before we can state the lemma, we need to describe those specific
labellings.

For a vector $d$, we let $d_{(1)}, \ldots, d_{(N)}$ be the ordered vector, so that $d_{(1)} = \min_{j=1,\ldots,N} d_j$ and
$d_{(N)} = \max_{j=1,\ldots,N} d_j$. Then the laws $f^{(d)}(1)$ and $h^{(d)}(1)$, respectively, are defined by
decreasing $d_{(N)}$ and $d_{(1)}$, respectively, by one. Thus,

= E ".-^w- - " + '1 + ~ - r ' ^ " ^ " 

i=l 
i=2 

For $f^{(d)}(M)$ and $h^{(d)}(M)$, respectively, we repeat the above change $M$ times. Here we note that
when $d_{(1)} = 1$, and for $h^{(d)}(M)$ we decrease it by one, we only keep the $d_i \ge 1$. Thus, in this
case, the number of groups of objects is decreased by 1.

Finally, we write that $f \preceq g$ when the distribution $f$ is stochastically dominated by $g$, i.e., when
$\sum_{i=0}^{n} f_i \ge \sum_{i=0}^{n} g_i$ for all $n \ge 0$. Similarly, we write that $X \preceq Y$ when for the probability mass
functions $f_X, f_Y$ we have that $f_X \preceq f_Y$.

We next prove stochastic bounds on the distribution $g^{(d)}(M)$ that are uniform in the choice of
the $M$ labelled objects.
Lemma A.2.8 For all choices of $M$ labelled objects

$$f^{(d)}(M) \preceq g^{(d)}(M) \preceq h^{(d)}(M). \qquad (A.2.32)$$

Thus, the expectation and variance of the random variable $X(M)$ with probability mass function
$g^{(d)}(M)$ are bounded by

$$\mathbb{E}[X(M)] \le \mathbb{E}[\overline{X}(M)], \qquad \mathrm{Var}[X(M)] \le \mathbb{E}[\overline{X}(M)^2], \qquad (A.2.33)$$

where $\overline{X}(M)$ has probability mass function $h^{(d)}(M)$.

Moreover, when $X_1, \ldots, X_l$ are draws from $g^{(d)}(M_1), \ldots, g^{(d)}(M_l)$, where the only dependence
between the $X_i$ resides in the labelled objects, then

$$\sum_{i=1}^{l}\underline{X}_i \preceq \sum_{i=1}^{l} X_i \preceq \sum_{i=1}^{l}\overline{X}_i, \qquad (A.2.34)$$

where $\{\underline{X}_i\}_{i=1}^{l}$ and $\{\overline{X}_i\}_{i=1}^{l}$, respectively, are i.i.d. copies of $\underline{X}$ and $\overline{X}$ with laws $f^{(d)}(M)$ and $h^{(d)}(M)$
for $M = \max_{i=1}^{l} M_i$, respectively.

In the proof of Proposition A.2.1, we will only use the upper bounds in Lemma A.2.8.
Proof. In order to prove (A.2.32), we will use induction in $M$. We note that $f^{(d)}(0) = g^{(d)}(0) =
h^{(d)}(0) = g^{(d)}$, and this initializes the induction. To advance the induction, we note that we need to
investigate the effect of labelling one extra object. For $f^{(d)}(M)$, we need to maximize the cumulative
distribution function, whereas for $h^{(d)}(M)$, we need to minimize it. Clearly, (A.2.30)-(A.2.31) are
optimal. This advances the induction. The statement in (A.2.33) follows from (A.2.32).

To prove (A.2.34), we see that for every $j$, conditionally on the 'past' $(X_1, \ldots, X_{j-1})$, the
random variable $X_j$ is stochastically bounded by $\underline{X}_j$ and $\overline{X}_j$, respectively. This completes the proof
of Lemma A.2.8. □

Lemma A.2.9 There exists $\beta > 0$ such that for all $j \le (\tfrac12+\eta)\log_\nu N$,

$$\mathbb{P}(E_j^{>}) \le CN^{-\beta}. \qquad (A.2.35)$$

Proof. The proof of Lemma A.2.9 follows the proof of Lemma A.2.7 and we focus on the differences
only.

Let $V$ denote the number of stubs out of the $\hat Z_{j-1}^{(1)}$ stubs that are attached to stubs with label
3 in the BP. Since for each stub in the $(j-1)$st generation, on $E'_{j-1}$, we have that there are at
most $2\sum_{i=1}^{j}\hat Z_i^{(1)} \le 2N^{\frac12+\delta}$ stubs with label 3, we have that $V$ is bounded from above by a binomial
random variable with $n = N^{\frac12+\delta}$ and $p = 2N^{\frac12+\delta}/L_N$. Thus, by the Markov inequality, we have
that for any $a > 2\delta$,

$$\mathbb{P}(V \ge N^{a}) \le CN^{-\beta}, \quad \text{with } \beta = a - 2\delta > 0, \qquad (A.2.36)$$

where we can take $a$ arbitrarily small by choosing $\delta > 0$ small.

We thus assume that V < N'^. We next proceed by investigating F{Ej^). Now, on Ej^ n£'j_i, 
we have that 

> (1 + N-''u^)Z^'\ (A.2.37) 

Thus, Zj^^ is larger than Zj^\ We note that Zj^' can only become larger than Zj^^ from (a) a redraw 
and the redraw exceeds the original draw from g'^'; and (b) stubs in Zj^-,^ that are not in Zj^]_^ 
which give rise to new stubs. On Ej^i, we thus have that (recalling that a^j = N~'^u^~^Zj^]_-^) 

0!N,j V 

^r-^-''<E^^ + E<' (A-2-38) 
i=l 1=1 
where $d'_i, d''_i$ are drawn from the appropriate conditional distributions given that we pick a stub with
label unequal to 3.

We note that each of the d[,d'l is obtained by drawing from stubs conditionally on labels not 
being 3. Since the total number of stubs labeled 3 is throughout the growth process bounded above 
by 2Eti < 2m+\ oiiV < iV", we obtain that by Lemma IXTsl and are 

bounded above by a^j + [A^'*] independent copies of Xj(2A2^'^), where for any M, Xi{M) has 
probability distribution h^^^M). 

We note that by ()A.2.33|) and Proposition 13.41 the expectation of Xi{2N'i^^) is bounded above 
hy V + A~°2 fQ]^ some 02 > 0, and the variance of Xi{2N2~^^) obeys the same bound as Varjv((i) in 
Lemma IA.2.51 Thus, we can copy the remaining part of the proof from the proof of Lemma lA.2.71 
□ 

A.3 Proof of Proposition 3.3

In this section, we prove Proposition 3.3. In fact, we will prove a slightly different result, as
formulated in the next proposition. This proposition summarizes the coupling results, and will be
instrumental both in this paper, as well as in [25], in which we investigate the case where $\tau \in (2, 3)$.

Proposition A. 3.1 Fix t > 2, and assume that M.I3\) holds. For any m such that, for any 7] > 
small enough, 

m 

P(^Z« > A'?) = 0(1), (A.3.1) 

there exist independent branching processes Z^^\Z'-^\ such that 

hm P(Z« = ZW) = L (A.3.2) 

Af— >cx) 

Remark: For fixed $m$, by the Markov inequality, (A.3.1) indeed holds. Therefore, Proposition 3.3
follows from (A.3.2). We are left to prove Proposition A.3.1.

Proof. By (TOU) . it suffices to show that F{Z^ = , Y,]li Zf <N'^) = l + o(l). For this, we 
use Lemma lA.2.2l to conclude that, for rj < 1/2, 

m m 

P(Z« = ^ if < A^) = P(Z« = ZW, J] < A") + 0(1). (A.3.3) 

i=i j=i 

By the coupling between Zm and Zm , a miscoupling occurs with probability equal to ppf defined 
in 1)3. 6(1 . Therefore, by Remark IA.1.31 the probability of a miscoupling for the offspring of a given 
individual is bounded from above by A""^ with probability 1 + 0{N^^'^). On the event that 
X]j=i -^j'^ < ^"^j the number of individuals that need to be coupled is bounded from above by A^^. 
We thus obtain that for any rj < 02, 

m 

P(ZW / ZW, ^ Zf < N'',p^ < A-"2) < ^ 0(1), (A.3.4) 

i=i 

which completes the proof. □ 